<html>
<head>
  <meta http-equiv="Content-Type" content="text/html">
  <style> pre {color: navy} tt {color: maroon} </style>
  <style> table {border-collapse: separate; border-spacing: 0px; empty-cells: show; background-color: #f0f0ff} </style>
  <style> th, td {text-align: left; padding-left: 15px; padding-right: 15px} </style>
</head>
<body>
<div align="center">
<h1>MCPP-MANUAL</h1>
<h2>== How to Use MCPP ==</h2>
</div>
<div align="right">
<h4>for V.2.7.2 (2008/11)<br>
Kiyoshi Matsui (kmatsui@t3.rim.or.jp)</h4>
</div>
<div align="center">
<h2>Contents</h2>
</div>
<dl><dt><a name="toc.1" href="#1">1. Overview</a>
<dd><a name="toc.1.1" href="#1.1">1.1. High Portability</a>
<dd><a name="toc.1.2" href="#1.2">1.2. Standard C Mode with Highest Conformance and Other Modes</a>
<dd><a name="toc.1.3" href="#1.3">1.3. Notations in this Manual</a>
<br>
<br>
<dt><a name="toc.2" href="#2">2. Invocation Options and Environment Settings</a>
<dd><a name="toc.2.1" href="#2.1">2.1. Two Kinds of Build and Five Behavioral Modes</a>
<dd><a name="toc.2.2" href="#2.2">2.2. How to Specify Invocation Options</a>
<dd><a name="toc.2.3" href="#2.3">2.3. Common Options</a>
<dd><a name="toc.2.4" href="#2.4">2.4. Options by <b>mcpp</b> Behavioral Modes</a>
<dd><a name="toc.2.5" href="#2.5">2.5. Common Options Except for Some Compiler Systems</a>
<dd><a name="toc.2.6" href="#2.6">2.6. Options by Compiler System</a>
<dd><a name="toc.2.7" href="#2.7">2.7. Environment Variables</a>
<dd><a name="toc.2.8" href="#2.8">2.8. Multi-Byte Character Encodings</a>
<dd><a name="toc.2.9" href="#2.9">2.9. How to Use <b>mcpp</b> in One-Pass Compilers</a>
<dd><dl><dt><a name="toc.2.10" href="#2.10">2.10. How to Use <b>mcpp</b> in IDE</a>
<dd><a name="toc.2.10.1" href="#2.10.1">2.10.1. How to Make <b>mcpp</b> available in Visual C++ IDE</a>
<dd><a name="toc.2.10.2" href="#2.10.2">2.10.2. How to Make <b>mcpp</b> available in Mac OS X / Xcode.app</a></dl>
<br>
<dt><a name="toc.3" href="#3">3. Enhancement and Compatibility</a>
<dd><dl><dt><a name="toc.3.1" href="#3.1">3.1. #pragma MCPP put_defines, #pragma MCPP preprocess and others</a>
<dd><a name="toc.3.1.1" href="#3.1.1">3.1.1. Pre-preprocessing of Header File</a></dl>
<dd><dl><dt><a name="toc.3.2" href="#3.2">3.2. #pragma once</a>
<dd><a name="toc.3.2.1" href="#3.2.1">3.2.1. Tool to Write #pragma once to Header Files</a></dl>
<dd><a name="toc.3.3" href="#3.3">3.3. #pragma MCPP warning, #include_next, #warning</a>
<dd><a name="toc.3.4" href="#3.4">3.4. #pragma MCPP push_macro, #pragma __setlocale and others</a>
<dd><dl><dt><a name="toc.3.5" href="#3.5">3.5. #pragma MCPP debug, #pragma MCPP end_debug, #debug, #end_debug</a>
<dd><a name="toc.3.5.1" href="#3.5.1">3.5.1. #pragma MCPP debug path, #debug path</a>
<dd><a name="toc.3.5.2" href="#3.5.2">3.5.2. #pragma MCPP debug token, #debug token</a>
<dd><a name="toc.3.5.3" href="#3.5.3">3.5.3. #pragma MCPP debug expand, #debug expand</a>
<dd><a name="toc.3.5.4" href="#3.5.4">3.5.4. #pragma MCPP debug if, #debug if</a>
<dd><a name="toc.3.5.5" href="#3.5.5">3.5.5. #pragma MCPP debug expression, #debug expression</a>
<dd><a name="toc.3.5.6" href="#3.5.6">3.5.6. #pragma MCPP debug getc, #debug getc</a>
<dd><a name="toc.3.5.7" href="#3.5.7">3.5.7. #pragma MCPP debug memory, #debug memory</a>
<dd><a name="toc.3.5.8" href="#3.5.8">3.5.8. #pragma MCPP debug macro_call</a></dl>
<dd><a name="toc.3.6" href="#3.6">3.6. #assert, #asm, #endasm</a>
<dd><a name="toc.3.7" href="#3.7">3.7. New C99 Features (_Pragma() operator, Variadic Macro and others)</a>
<dd><dl><dt><a name="toc.3.8" href="#3.8">3.8. Particular specifications for certain compiler system</a>
<dd><a name="toc.3.8.1" href="#3.8.1">3.8.1. Variadic macro of GCC and Visual C</a>
<dd><a name="toc.3.8.2" href="#3.8.2">3.8.2. Handling of 'defined' by GCC</a>
<dd><a name="toc.3.8.3" href="#3.8.3">3.8.3. Asm Statement in Borland C and Other Special Syntaxes</a>
<dd><a name="toc.3.8.4" href="#3.8.4">3.8.4. #import and Others</a></dl>
<dd><dl><dt><a name="toc.3.9" href="#3.9">3.9. Problems of GCC and Compatibility with GCC</a>
<dd><a name="toc.3.9.1" href="#3.9.1">3.9.1. Preprocessing FreeBSD 2/Kernel Sources</a>
<dd><a name="toc.3.9.2" href="#3.9.2">3.9.2. Preprocessing FreeBSD 2/Libc</a>
<dd><a name="toc.3.9.3" href="#3.9.3">3.9.3. Problems Concerning GCC 2/cpp</a>
<dd><a name="toc.3.9.4" href="#3.9.4">3.9.4. Preprocessing Linux/glibc 2.1</a>
<dd><dl><dt><a name="toc.3.9.5" href="#3.9.5">3.9.5. To Use <b>mcpp</b> with GCC 2</a>
<dd><a name="toc.3.9.5.1" href="#3.9.5.1">3.9.5.1. To Sort <b>mcpp</b>'s Warnings</a></dl>
<dd><a name="toc.3.9.6" href="#3.9.6">3.9.6. Preprocessing GCC 3.2 Source</a>
<dd><a name="toc.3.9.7" href="#3.9.7">3.9.7. To Use <b>mcpp</b> with GCC 3 or 4</a>
<dd><a name="toc.3.9.8" href="#3.9.8">3.9.8. Preprocessing Linux/glibc 2.4</a>
<dd><a name="toc.3.9.9" href="#3.9.9">3.9.9. The Problems of Linux / stddef.h, limits.h and #include_next</a>
<dd><a name="toc.3.9.10" href="#3.9.10">3.9.10. Problems of Mac OS X / Apple-GCC and its System Headers</a>
<dd><a name="toc.3.9.11" href="#3.9.11">3.9.11. Preprocessing firefox 3.0b3pre</a></dl>
<dd><dl><dt><a name="toc.3.10" href="#3.10">3.10. Visual C++ System Header Problems</a>
<dd><a name="toc.3.10.1" href="#3.10.1">3.10.1. Comment Generating Macro?</a>
<dd><a name="toc.3.10.2" href="#3.10.2">3.10.2. '$' in Identifiers</a></dl>
<br>
<dt><a name="toc.4" href="#4">4. Implementation-defined Behaviors</a>
<dd><a name="toc.4.1" href="#4.1">4.1. Status Value on Exit</a>
<dd><a name="toc.4.2" href="#4.2">4.2. Include Directory Search Path</a>
<dd><a name="toc.4.3" href="#4.3">4.3. How to Construct Header Name</a>
<dd><a name="toc.4.4" href="#4.4">4.4. Evaluation of #if Expression</a>
<dd><a name="toc.4.5" href="#4.5">4.5. Character Constant Evaluation in #if Expression</a>
<dd><a name="toc.4.6" href="#4.6">4.6. #if sizeof (type)</a>
<dd><a name="toc.4.7" href="#4.7">4.7. How to Handle White-Space Sequence</a>
<dd><a name="toc.4.8" href="#4.8">4.8. Default Specifications for <b>mcpp</b> Executables</a>
<br>
<br>
<dt><a name="toc.5" href="#5">5. Diagnostic Messages</a>
<dd><a name="toc.5.1" href="#5.1">5.1. Diagnostic Messages Format</a>
<dd><a name="toc.5.2" href="#5.2">5.2. Translation Limits</a>
<dd><dl><dt><a name="toc.5.3" href="#5.3">5.3. Fatal Errors</a>
<dd><a name="toc.5.3.1" href="#5.3.1">5.3.1. <b>mcpp</b>'s Own Bugs</a>
<dd><a name="toc.5.3.2" href="#5.3.2">5.3.2. Physical Errors</a>
<dd><a name="toc.5.3.3" href="#5.3.3">5.3.3. Translation Limits and Internal Buffer Errors</a>
<dd><a name="toc.5.3.4" href="#5.3.4">5.3.4. #pragma MCPP preprocessed Related Errors</a></dl>
<dd><dl><dt><a name="toc.5.4" href="#5.4">5.4. Errors</a>
<dd><a name="toc.5.4.1" href="#5.4.1">5.4.1. Character and Token Related Errors</a>
<dd><a name="toc.5.4.2" href="#5.4.2">5.4.2. Unterminated Source File Related Errors</a>
<dd><a name="toc.5.4.3" href="#5.4.3">5.4.3. Ill-Balanced Preprocessing Group Related Errors</a>
<dd><a name="toc.5.4.4" href="#5.4.4">5.4.4. Simple Syntax Errors on Directive Lines</a>
<dd><a name="toc.5.4.5" href="#5.4.5">5.4.5. Syntax Errors in #if Expressions</a>
<dd><a name="toc.5.4.6" href="#5.4.6">5.4.6. #if Expression Evaluation Errors</a>
<dd><a name="toc.5.4.7" href="#5.4.7">5.4.7. #define Related Errors</a>
<dd><a name="toc.5.4.8" href="#5.4.8">5.4.8. #undef Related Errors</a>
<dd><a name="toc.5.4.9" href="#5.4.9">5.4.9. Macro Expansion Errors</a>
<dd><a name="toc.5.4.10" href="#5.4.10">5.4.10. #error and #assert</a>
<dd><a name="toc.5.4.11" href="#5.4.11">5.4.11. Failure of #include</a>
<dd><a name="toc.5.4.12" href="#5.4.12">5.4.12. Other Errors</a></dl>
<dd><dl><dt><a name="toc.5.5" href="#5.5">5.5. Warnings (Class 1)</a>
<dd><a name="toc.5.5.1" href="#5.5.1">5.5.1. Character, Token and Comment Related Warnings</a>
<dd><a name="toc.5.5.2" href="#5.5.2">5.5.2. Unterminated Source File Related Warnings</a>
<dd><a name="toc.5.5.3" href="#5.5.3">5.5.3. Directive Line Related Warnings</a>
<dd><a name="toc.5.5.4" href="#5.5.4">5.5.4. #if Expression Related Warnings</a>
<dd><a name="toc.5.5.5" href="#5.5.5">5.5.5. Macro Expansion Related Warnings</a>
<dd><a name="toc.5.5.6" href="#5.5.6">5.5.6. Line Number Related Warnings</a>
<dd><a name="toc.5.5.7" href="#5.5.7">5.5.7. #pragma MCPP warning, #warning</a></dl>
<dd><a name="toc.5.6" href="#5.6">5.6. Warnings (Class 2)</a>
<dd><a name="toc.5.7" href="#5.7">5.7. Warnings (Class 4)</a>
<dd><a name="toc.5.8" href="#5.8">5.8. Warnings (Class 8)</a>
<dd><a name="toc.5.9" href="#5.9">5.9. Warnings (Class 16)</a>
<dd><a name="toc.5.10" href="#5.10">5.10. Diagnostic Messages Index</a>
<br>
<br>
<dt><a name="toc.6" href="#6">6. Reporting on Bugs and Others</a>
</dl>
<br>

<h1><a name="1" href="#toc.1">1. Overview</a></h1>
<p><b>mcpp</b> is a C preprocessor developed by kmatsui (Kiyoshi Matsui).  It was originally based on DECUS cpp written by Martin Minow, and has since been rewritten entirely.  <b>mcpp</b> means Matsui cpp.  This software is supplied as source code; to use <b>mcpp</b> with a given compiler system, a small number of modifications to adapt it to that compiler system are required before it can be compiled into an executable. *1</p>
<p>This document describes the specifications of <b>mcpp</b> executables that have already been ported to certain compiler systems.  Those who want to know more about <b>mcpp</b> or want to port it to other compiler systems should refer to the <b>mcpp</b> source and its document <a href="mcpp-porting.html">mcpp-porting.html</a>.</p>
<p>All these sources and related documents are provided as open-source software.</p>
<p>Before going into detail, some of the <b>mcpp</b> features are introduced here.</p>
<p>Note:</p>
<p>*1 <b>mcpp</b> V.2.6.3 onward provides some binary packages too, at the following site.</p>
<blockquote>
<p><a href="http://mcpp.sourceforge.net/">http://mcpp.sourceforge.net/</a></p>
</blockquote>
<br>

<h2><a name="1.1" href="#toc.1.1">1.1. High Portability</a></h2>
<p><b>mcpp</b> is a portable preprocessor, supporting various operating systems, including Linux, FreeBSD and Windows.  Its source is widely portable, and can be compiled by any compiler that supports Standard C or C++ (ANSI/ISO C or C++).  The library functions used are only the classic ones.</p>
<p>To port <b>mcpp</b> to a compiler system, in many cases, one only needs to change some macro definitions in the header files and simply compile it.  In the worst case, adding several dozen lines to a source file would be enough.</p>
<p>To process multi-byte characters (Kanji), it supports Japanese EUC-JP, shift-JIS and ISO2022-JP, Chinese GB-2312, Taiwanese Big-5 and Korean KSC-5601 (KSX 1001), as well as UTF-8.  For shift-JIS, ISO2022-JP or Big-5, <b>mcpp</b> can complement the compiler-proper if it does not recognize them.</p>
<br>

<h2><a name="1.2" href="#toc.1.2">1.2. Standard C Mode with Highest Conformance and Other Modes</a></h2>
<p><b>mcpp</b> has various behavioral modes.  Other than Standard-conforming mode, there are K&amp;R 1st mode, "Reiser" cpp mode and what I call post-Standard mode.  <b>mcpp</b> also has an execution option to run as a C++ preprocessor.</p>
<p>Unlike many existing preprocessors, Standard mode of <b>mcpp</b> has the highest conformance to the Standards: all of C90, C99 and C++98.  It has been developed with the aim of becoming the reference model of the Standard C preprocessor.  The version of the Standard can be specified by an execution option. *1</p>
<p>In addition, it provides several useful enhancements: '#pragma MCPP debug', which traces the process of macro expansion or #if expression evaluation, and the header file "pre-preprocessing" facility.</p>
<p><b>mcpp</b> also provides several useful execution options, such as warning level or include directory specification options.</p>
<p>Even if there are mistakes in the source, <b>mcpp</b> handles them properly, issuing accurate and plain diagnostic messages without running out of control or displaying misleading error messages.  It also displays warnings for portability problems.  Detailed documents are also attached.</p>
<p>In spite of its high quality, <b>mcpp</b>'s code size and memory usage are relatively small.</p>
<p>A disadvantage of <b>mcpp</b>, if any, is its slower processing speed.  It takes two to three times as long as GCC 3.*, 4.* / cc1, but considering that its processing speed is almost the same as that of Borland C 5.5/cpp32 and that it runs a little faster when the header file pre-preprocessing facility is used, it cannot be described as particularly slow.  <b>mcpp</b> puts an emphasis on Standard conformance, source portability and operability in a small memory space, making this level of processing speed inevitable.</p>
<p>The Validation Suite for Standard C Preprocessing, which is used to test the extent to which a preprocessor conforms to Standard C, and its documentation cpp-test.html, which contains the results of applying the Validation Suite to various preprocessors, are also released with <b>mcpp</b>.  When looking through this file, you will notice that so-called Standard-conforming preprocessors have many conformance-related problems.</p>
<p>During the course of developing <b>mcpp</b> V.2.3, it was selected, along with its Validation Suite, as one of the "Exploratory Software Projects for 2002" by Information-technology Promotion Agency (IPA), Japan.  From July 2002 to February 2003, the project, financed by IPA, proceeded under the advice of project manager Yutaka Niibe.  I asked "HighWell, Inc." Limited Company, Tokyo, to translate all the documents.  For technical details, I revised and corrected the translated documents.</p>
<p><b>mcpp</b> was again adopted as one of the "Exploratory Software Projects" in 2003, by project manager Hiroshi Ichiji.  The update of <b>mcpp</b> proceeded to the next version, V.2.4. *2</p>
<p>Since the project ended, I have continued to update <b>mcpp</b> and the Validation Suite.</p>
<p>Note:</p>
<p>*1 ISO/IEC 9899:1990 (JIS X 3010-1993) had been used as C Standard, but in 1999, ISO/IEC 9899:1999 was adopted as a new Standard.  This document calls the former C90 and latter C99.  The former is generally called ANSI C or C89 because it migrated from ANSI X3.159-1989.  ISO/IEC 9899:1990 + Amendment 1995 is sometimes called C95.  C++ Standards are ISO/IEC 14882:1998 and its corrigendum version ISO/IEC 14882:2003.  This document calls both of them C++98.</p>
<p>*2 The outline of the "Exploratory Software Project" can be seen at the following site (Japanese only).</p>
<blockquote>
<p><a href="http://www.ipa.go.jp/jinzai/esp/"> http://www.ipa.go.jp/jinzai/esp/</a></p>
</blockquote>
<p><b>mcpp</b> from V.2.3 through V.2.5 had been located at:</p>
<blockquote>
<p><a href="http://www.m17n.org/mcpp/"> http://www.m17n.org/mcpp/</a></p>
</blockquote>
<p>In April 2006, <b>mcpp</b> project moved to:</p>
<blockquote>
<p><a href="http://mcpp.sourceforge.net/"> http://mcpp.sourceforge.net/</a></p>
</blockquote>
<p>The old version of <b>mcpp</b>, cpp V.2.2 and Validation Suite V.1.2 are located in the following Vector's web site.  They are in the directory called dos/prog/c, but they are not for MS-DOS exclusively.  Sources are for UNIX, WIN32, MS-DOS.  The documents are Japanese only.</p>
<blockquote>
<a href="http://www.vector.co.jp/soft/dos/prog/se081188.html">http://www.vector.co.jp/soft/dos/prog/se081188.html</a><br>
<a href="http://www.vector.co.jp/soft/dos/prog/se081189.html">http://www.vector.co.jp/soft/dos/prog/se081189.html</a><br>
<a href="http://www.vector.co.jp/soft/dos/prog/se081186.html">http://www.vector.co.jp/soft/dos/prog/se081186.html</a><br>
</blockquote>
<p>The text files in the archive files available at Vector use [CR]+[LF] as a &lt;newline&gt; and encode Kanji in shift-JIS for DOS/Windows.  On the other hand, those from V.2.3 through V.2.5 available at SourceForge use [LF] as a &lt;newline&gt; and encode Kanji in EUC-JP for UNIX.  From V.2.6 on, two types of archive are provided: a .tar.gz file with [LF]/EUC-JP and a .zip file with [CR]+[LF]/shift-JIS.</p>
<br>

<h2><a name="1.3" href="#toc.1.3">1.3. Notations in this Manual</a></h2>
<p>This manual was a text file in older versions, but was changed to an HTML file at V.2.6.2.<br> 
This manual uses the following typographical conventions:</p>
<ul>
<li><tt style="color:navy">source:</tt><br>
<tt style="color:navy">Navy</tt> colored constant-width font is used for code snippets and command line inputs.<br>
<li><tt>__STDC__:</tt><br>
<tt style="color:maroon">Maroon</tt> colored constant-width font is used for Standard predefined macros or any other macros found in some codes.<br>
<li><i>STD</i>:<br>
<i>Italic</i> font is used for the macros defined in <b>mcpp</b> source file named system.H.  This manual uses these names to denote various <b>mcpp</b> settings.  Note that these macros are only used in compilation of <b>mcpp</b>, and that the <b>mcpp</b> executable does not have such macros.<br>
</ul>
<br>

<h1><a name="2" href="#toc.2">2. Invocation Options and Environment Settings</a></h1>

<h2><a name="2.1" href="#toc.2.1">2.1. Two Kinds of Build and Five Behavioral Modes</a></h2>
<p>There are two types of build (or compiling configuration) for <b>mcpp</b> executable. *1, *2</p>
<ol>
<li><b>Compiler-independent-build</b>: a preprocessor that behaves on its own, independently of any compiler system.  Its invocation options are the same regardless of the compiler with which <b>mcpp</b> is compiled.  Although it can preprocess source files, it cannot behave as an integrated part of a compiler system.<br>
<li><b>Compiler-specific-build</b>: a preprocessor intended to replace the resident preprocessor of a certain compiler system, where possible.  Each compiler-specific-build has some specifications of its own for compatibility with its compiler system.  It shares its options with the compiler-independent-build, except for a few options that differ in order to avoid conflicts with the compiler system.<br>
</ol>
<p>Each <b>mcpp</b> executable has the following five behavioral modes regardless of the build type.</p>
<ol>
<li><i>STD</i>: Standards (C90, C99, C++98) conforming mode.  This is the default.<br>
<li><i>COMPAT</i>: A variation of <i>STD</i> mode, which expands recursive macros further than the Standards specify.<br>
<li><i>POSTSTD</i>: Special "post-Standard" mode created by the author, based on the Standards and simplified by removing all of the Standards' irregular specifications.<br>
<li><i>KR</i>: K&amp;R 1st specification mode.<br>
<li><i>OLDPREP</i>: "Reiser" model cpp mode (old-preprocessor mode).<br>
</ol>
<p>The mode of <b>mcpp</b> is specified by the run-time options as follows:</p>
<ul>
<li><samp>-@std</samp><br>
The <i>STD</i> mode (default).<br>
<li><samp>-@compat</samp><br>
The <i>COMPAT</i> mode.<br>
<li><samp>-@poststd, -@post</samp><br>
The <i>POSTSTD</i> mode.<br>
<li><samp>-@kr</samp><br>
The <i>KR</i> mode.<br>
<li><samp>-@oldprep, -@old</samp><br>
The <i>OLDPREP</i> mode.<br>
</ul>
<p>In this document, I group <i>OLDPREP</i> and <i>KR</i> into pre-Standard modes, and group <i>STD</i>, <i>COMPAT</i> and <i>POSTSTD</i> into Standard modes.  Since <i>COMPAT</i> mode is almost the same as <i>STD</i> mode, <i>STD</i> includes <i>COMPAT</i> unless otherwise mentioned. *3</p>
<p>The macro expansion methods differ between Standard and pre-Standard modes.  Roughly speaking, this is the difference between C90 and pre-C90.  The biggest difference is in the expansion of function-like macros (macros with arguments).  When an argument contains a macro, in Standard mode <b>mcpp</b> completely expands the argument first and then substitutes it for the parameter in the replacement list of the original macro, whereas in pre-Standard mode <b>mcpp</b> substitutes the unexpanded argument for the parameter and expands it at rescan time.</p>
<p>Also, in Standard mode, a macro is not expanded recursively in principle, even if the macro definition is recursive directly or indirectly.  If there is a recursive macro definition in pre-Standard mode, it causes infinite recursion and becomes an error at expansion time.</p>
<p>Handling of \ at the end of a line also differs by mode.  In Standard mode, after trigraph processing, each &lt;backslash&gt;&lt;newline&gt; sequence is deleted before tokenization, but in pre-Standard mode, it is deleted only when it is within a string literal or in a #define line.</p>
<p>There is also a subtle difference in tokenization (token parsing, decomposition into tokens).  Standard mode tokenizes on the "token based processing" principle.  Concretely, in Standard mode, spaces are inserted around an expanded macro to prevent unexpected merging with its adjacent tokens.  Pre-Standard mode retains traces of the traditional, convenient and tacit tokenization and macro expansion methods of "character based text replacement".  On these points, please see <a href="cpp-test.html#2"> cpp-test.html#2</a>.</p>
<p>In Standard mode, <b>mcpp</b> handles the numeric token, called preprocessing number, according to the Standard specification.  In pre-Standard mode, the numeric tokens are the same as integer constant tokens or floating point tokens.  The suffixes 'U', 'u', 'LL' and 'll' of integer constants and the suffixes 'F', 'f', 'L' and 'l' of floating point constants are not recognized as a part of the tokens in pre-Standard mode.</p>
<p>The string literals and character constants of wide characters are recognized as single tokens only in Standard mode.</p>
<p>Digraphs, #error, #pragma, and the _Pragma() operator are available only in Standard mode.  Also, the -S &lt;n&gt; option (strict-ansi specs) and the -+ option (to run as a C++ preprocessor) are used only in Standard mode.  The predefined macros <tt>__STDC__</tt>, <tt>__STDC_VERSION__</tt> are defined in Standard mode, and are not defined in pre-Standard mode.</p>
<p>#if defined, #elif cannot be used in pre-Standard mode.  Macros cannot be used within the argument of #include or #line in pre-Standard mode.  The predefined macros <tt>__FILE__</tt>, <tt>__LINE__</tt>, <tt>__DATE__</tt>, <tt>__TIME__</tt> are not defined in pre-Standard mode.</p>
<p>On the other hand, #assert, #asm (#endasm), #put_defines and #debug are available in pre-Standard mode only.</p>
<p>An #if expression is evaluated in long / unsigned long or long long / unsigned long long in Standard mode, and in (signed) long only in pre-Standard mode.  sizeof (type) in an #if expression can be used only in pre-Standard mode.</p>
<p>Trigraphs and UCN (universal character name) are available only in <i>STD</i> mode.</p>
<p>The output of diagnostic messages is also slightly different between the modes.  Please see chapter <a href="#5"> 5</a> for details.</p>
<p>Any other items, which do not have any distinct rules between K&amp;R 1st and the Standards, follow the C90 rules in pre-Standard mode.</p>
<p>The difference of <i>OLDPREP</i> mode from <i>KR</i> mode and the difference of <i>POSTSTD</i> and <i>COMPAT</i> modes from <i>STD</i> mode are as follows:</p>
<ul>
<li><i>OLDPREP</i><br>
<ol>
<li>Converts a comment to zero spaces instead of one space.  Usually this conversion is done in the output at the end.  In a macro definition, however, the conversion is done immediately after the definition.<br>
<li>When there are string literals or character constants in the replacement list of a macro definition, and a parameter name matches any part of them, that part is substituted with the argument corresponding to the parameter when the macro is called.  That is to say, the content of the string literal or character constant, with the enclosing quotes stripped, is searched as a token sequence, and any parameter name found there is substituted.<br>
<li>You can write anything you like on #else and #endif lines. (One usually writes the MACRO of the corresponding #if MACRO or #ifdef MACRO.)<br>
<li>It suppresses the "unterminated string literal" and "unterminated character constant" errors.  If a literal lacks its closing " or ', it is assumed to close at the end of the line.<br>
<li>It treats '# 123' line as '#line 123'.<br>
</ol>
<br>
<li><i>COMPAT</i><br>
Expands recursive macros further than the Standard's specification.  When expanding a recursive macro, the range in which the same name is not re-replaced is set narrower than in the Standard.<br>
<br>
Refer to <a href="cpp-test.html#3.4.26"> cpp-test.html#3.4.26</a> about the specifications of recursive macro expansion.  See test-t/recurs.t for a sample of recursive macro. *4<br>
<br>
<li><i>POSTSTD</i><br>
This mode differs from <i>STD</i> mode in the following points:<br>
<ol>
<br>
<li>Does not recognize trigraphs.  Digraphs are converted at translation phase 1, that is, at the beginning of preprocessing, and are not treated as tokens.<br>
<li>Tokenization is simplified according to a completely token-based rule.  When there is no white space serving as a token separator between preprocessing tokens in the source code, a space is inserted automatically.  (However, no space is inserted between a macro name and the following "(" within a macro definition.)  Therefore, even when stringizing with the # operator, the stringizing takes place after a space has been inserted between all the preprocessing tokens.  Also, on re-definition of a macro, it does not matter whether there is a token separator or not.<br>
<li>On re-definition of a function-like macro, differences in parameter names are not relevant.<br>
<li>Character constants cannot be used in #if expressions (it will cause an error).<br>
<li>The irregular "function-unlike" rules for function-like macro expansion are removed.  Hence, rescanning targets only the replacement list of the macro, and not the sequence after it.<br>
<li>Normally, a header name in the format of #include &lt;stdio.h&gt; is accepted, but it draws a warning (with the class 2 warning option).  If a header name in the format of &lt;stdio.h&gt; is used in a macro, it can cause an error in particular cases.  The format of #include "stdio.h" is recommended.<br>
<li>C99 added the rule that a space is required between the macro name and the replacement list in a macro definition, but this mode does not enforce that rule. (A space is inserted automatically at tokenization.)<br>
<li>UCN (universal-character-name) is not recognized.  Multi-byte characters in identifier are not recognized.<br>
<li>In C++, the eleven identifier-like operators are not treated as operators.<br>
</ol>
</ul>
<p>Moreover, there is a mode called <b>lang-asm</b>.
This is a mode to process anomalous sources, which are assembler sources that nevertheless have C comments, directives and macros embedded.
While <i>POSTSTD</i> mode cannot enter this mode, <i>STD</i>, <i>KR</i> and <i>OLDPREP</i> modes switch to it when specified by an option.
See <a href="#2.5">2.5</a> for its specifications.</p>
<p>For the above reasons, the specifications differ somewhat among <b>mcpp</b> executables.  So, please read this manual carefully.  This chapter first describes the common options, next the behavioral-mode-dependent options, then the options common to most compiler systems, and finally the compiler-dependent options for each compiler-specific-build.</p>
<p>Note:</p>
<p>*1 There is another build named <b>subroutine-build</b>, which is called as a subroutine from some other main program.  The behavioral specification of subroutine-build is, however, the same as that of either compiler-specific-build or compiler-independent-build, according to its compile time setting.  Hence, this manual does not mention subroutine-build particularly.
As for subroutine-build, refer to <a href="mcpp-porting.html#3.12">mcpp-porting.html#3.12</a>.</p>
<p>*2 The binary packages provided at the SourceForge site are of compiler-independent-builds.</p>
<p>*3 <b>mcpp</b> had two separate executables for Standard mode and pre-Standard mode; they were integrated into one at V.2.6.</p>
<p>*4 This option is for compatibility with GCC, Visual C++ and other major implementations.  'compat' means "compatible mode".</p>
<br>

<h2><a name="2.2" href="#toc.2.2">2.2. How to Specify Invocation Options</a></h2>
<p>The &lt;arg&gt; and [arg] shown below indicate required and optional arguments respectively.  Note that the &lt;,  &gt;,  [, or ] character itself must not be entered.</p>
<p><b>mcpp</b> invocation takes a form of:</p>
<pre>
mcpp [-&lt;opts&gt; [-&lt;opts&gt;]] [in_file] [out_file] [-&lt;opts&gt; [-&lt;opts&gt;]]
</pre>
<p>Note that you have to replace the above "mcpp" with another name, depending on how <b>mcpp</b> is installed.</p>
<p>When out_file (an output path) is omitted, stdout is used unless the -o option is specified.  When in_file (an input path) is omitted, stdin is used.  A diagnostic message is output to stderr unless the -Q option is specified.</p>
<p>If any of these files cannot be opened, preprocessing is terminated, issuing an error message.</p>
<p>For an option with argument, white-space characters may or may not be inserted between the option character and an argument.  In other words, both of "-I&lt;arg&gt;" and "-I &lt;arg&gt;" are acceptable.  For options without argument, both of "-Qi" and "-Q -i" are valid.</p>
<p>For an option with an argument, omitting a required argument causes an error, except for the -M option.</p>
<p>If -D, -U, -I, or -W option is specified multiple times, each of them is valid.  For -S, -V, or -+ option, only the first one is valid.  For -2, or -3 option, its specification switches each time an option is specified.  For other options, the last one is valid.</p>
<p>The option letters are case sensitive.</p>
<p>The switch character is '-', not '/', even under Windows.</p>
<p>When invalid options are specified, a usage statement is displayed.  To check valid options, enter a command, such as "mcpp -?".  In addition to the usage message, there are several error messages, but they are self-explanatory.  I will omit their explanations.</p>
<br>

<h2><a name="2.3" href="#toc.2.3">2.3. Common Options</a></h2>
<p>This section covers common options across <b>mcpp</b> modes or compiler systems.</p>
<ul>
<li><samp>-C</samp><br>
Also output comments in the source code.  This option is useful for debugging.  Note that a comment is moved ahead of a logical source line when output.  This is because a comment is processed before macro expansion or directive processing, and a comment may appear during a macro invocation.<br>
<br>
<li><samp>-D &lt;macro&gt;[=[&lt;value&gt;]]</samp><br>
<li><samp>-D &lt;macro(a,b)&gt;[=[&lt;value&gt;]]</samp><br>
Define a macro named "macro".  This option can be used to change the definitions of predefined macros other than <tt>__STDC__</tt>, <tt>__STDC_VERSION__</tt>, <tt>__FILE__</tt>, <tt>__LINE__</tt>, <tt>__DATE__</tt>, <tt>__TIME__</tt> and <tt>__cplusplus</tt>. (<tt>__STDC_HOSTED__</tt>, C99's predefined macro, is exceptionally redefined by this option, because some compiler systems, like GCC V.3, use the -D option to define <tt>__STDC_HOSTED__</tt>.)  To specify a value, use "=&lt;value&gt;".  If "=&lt;value&gt;" is omitted, 1 is assumed. (Note that in bcc32, the macro is defined as zero-token by default.)  Do not enter white-space characters immediately before "=".  If a white-space character is entered immediately after "=", the macro is defined as zero token.<br>
A macro with arguments can be defined by this option.<br>
This option can be specified repeatedly.<br>
<br>
<li><samp>-e &lt;encoding&gt;</samp><br>
Change a multi-byte character encoding to &lt;encoding&gt;. For &lt;encoding&gt;, refer to <a href="#2.8"> 2.8</a>.<br>
<br>
<li><samp>-I &lt;directory&gt;</samp><br>
Specify the first directory in the include directory search path order with &lt;directory&gt;.  For a search path, refer to <a href="#4.2"> 4.2</a>.  If a directory name contains spaces, it has to be enclosed with " and ".<br>
<br>
<li><samp>-I 1, -I 2, -I 3</samp><br>
Specify a directory from which <b>mcpp</b> begins searching when it encounters a #include "header" directive (i.e. not &lt;header&gt; format).  -I1, -I2 and -I3 indicate the current directory, the source file (i.e. includer) directory, and both, respectively.  For details, see <a href="#4.2"> 4.2</a>.<br>
<br>
<li><samp>-j</samp><br>
On outputting a diagnostic message, <b>mcpp</b> displays only one line of diagnostic without additional information, such as source lines.  (By default, one line of diagnostic message is followed by a source code line having a problem.  If the source code line in question is found in a #included file, all the #including file names and including line numbers are also displayed in sequence.  For a diagnostic on macro, <b>mcpp</b> displays also its definition information).<br>
When Validation Suite is used in the GCC testsuite, this option has to be specified to output a diagnostic message in the same format as GCC.<br>
</ul>
<p>The -M* options output source file dependency lines for a makefile.  When there are several source files, the -M* option is specified for each of them, and the outputs are merged into one file, the dependency description lines are aligned.  These options are similar to those of GCC, but there are several differences. *1</p>
<ul>
<li><samp>-M</samp><br>
Output lines that describe dependency among source files.  The output destination is the file specified in a command line, or stdout if omitted.  If a dependency description is too long to fit in a line, it is folded over the next lines.  The preprocessing result is not output.<br>
<br>
<li><samp>-MM</samp><br>
Almost the same as -M, except that the following header files are not output.<br>
<ol>
<li>Files specified in the format of #include &lt;stdio.h&gt;.<br>
<li>Files specified using an absolute path name, such as #include "/usr/include/stdio.h".<br>
<li>Files specified in the format of #include "stdio.h" that are found not in the current or source directory, depending on compiler systems or the -I &lt;n&gt; option, but in system include directories, including those specified with the -I &lt;directory&gt; option or with environment variables.<br>
</ol>
The GCC-specific-build differs, however: it outputs all header files except system headers, as GCC does.<br><br>
<li><samp>-MD [FILE]</samp><br>
Almost the same as -M, except that the preprocessing result is also output, to the file specified on the command line or to stdout.  If FILE is specified, <b>mcpp</b> outputs dependency description lines to that file.  Otherwise, they are output to a file that has the same base name as the source file, with the suffix ".d" instead of ".c".<br>
<br>
<li><samp>-MMD [FILE]</samp><br>
Almost the same as -MD, except that, like -MM, the files regarded as system headers are not output.  The file to which <b>mcpp</b> writes the dependency description lines is the same as for -MD [FILE].<br>
<br>
<li><samp>-MF FILE</samp><br>
The dependency lines are output to FILE.  -MF FILE takes precedence over -MD FILE or -MMD FILE.<br>
<br>
<li><samp>-MP</samp><br>
"Phony targets" are also output.  Each included file can be written as a phony target without a dependency as follows:<br>
<pre>
test.o: test.c test.h
test.h:
</pre>
<li><samp>-MT TARGET</samp><br>
Use TARGET as the target name instead of the default foo.o.  For example, -MT '$(objpfx)foo.o' outputs the following line.<br>
<pre>
$(objpfx)foo.o: foo.c
</pre>
<li><samp>-MQ TARGET</samp><br>
Same as -MT, except that a string that has a special meaning to 'make' is 'quoted' as follows:<br>
<pre>
$$(objpfx)foo.o: foo.c
</pre>
<li><samp>-N</samp><br>
Disable all the predefined macros, including those that begin with "_", except for the ones required by Standards and <tt>__MCPP</tt>.  The Standard predefined macros include <tt>__FILE__, __LINE__,  __DATE__, __TIME__, __STDC__, __STDC_VERSION__</tt>, as well as <tt>__STDC_HOSTED__</tt> for C99 and <tt>__cplusplus</tt> for C++.  If you want to disable <tt>__MCPP</tt>, use the -U option.<br>
<br>
<li><samp>-o &lt;file&gt;</samp><br>
Output the preprocessed source to &lt;file&gt;.  If this option is omitted, the second file argument is regarded as the output path, so this option is not strictly necessary.  Some compiler drivers, however, use it.<br>
<br>
<li><samp>-P</samp><br>
Do not output line number information for the compiler-proper.  This option is specified when you want to use <b>mcpp</b> for purposes other than C preprocessing.<br>
<br>
<li><samp>-Q</samp><br>
Output diagnostic messages to the "mcpp.err" file in the current directory.  Because messages are appended to this file, it keeps growing; delete it from time to time.<br>
<br>
<li><samp>-U &lt;macro&gt;</samp><br>
Undefine the predefined macro named "macro".  This option cannot undefine <tt>__FILE__, __LINE__, __DATE__, __TIME__, __STDC__, __STDC_VERSION__</tt> (and <tt>__STDC_HOSTED__</tt> for C99), nor <tt>__cplusplus</tt> when <b>mcpp</b> is invoked with the -+ option.<br>
<br>
<li><samp>-v</samp><br>
Output the <b>mcpp</b> version and a search order of include directories to stderr.<br>
However, when -K option, explained at <a href="#2.4">2.4</a>, is specified or #pragma MCPP macro_call directive, explained at <a href="#3.5.8">3.5.8</a>, is specified, this option changes its meaning.<br>
<br>
<li><samp>-W &lt;level&gt;</samp><br>
Specify a warning level with &lt;level&gt;.  &lt;level&gt; should be 0 or the bitwise OR of one or more of the values 1, 2, 4, 8 and 16, each of which indicates a warning class.  For example, if -W 5 is specified, warnings of classes 1 and 4 are output.  If 0 is specified, no warnings are output.  If this option is specified several times, all the specified values are ORed together.  For example, -W 1 -W 2 -W 4 is equivalent to -W 7.  Instead of -W 7 you can also write -W "1|2|4". (Enclose the value in double quotes so that | is not interpreted as a pipe.)  If this option is omitted, -W 1 is assumed.  For warning messages, refer to <a href="#5.5"> 5.5</a> through <a href="#5.9"> 5.9</a>.<br>
<br>
<li><samp>-z</samp><br>
The preprocessing results of the #included files are not output, but the macros in them are defined.  The #include lines themselves are output instead, though #include lines within an included file are not output.  This option is used for debugging preprocessing.<br>
</ul>
<p>Note:</p>
<p>*1 <b>mcpp</b> differs from GCC in that:</p>
<ol>
<li><b>mcpp</b> does not provide the -MG option because its option specification is too complicated. (Therefore, I will omit its explanation.)  The -M option can substitute for the -MG option because when include files cannot be found using the -M option, <b>mcpp</b> fails but outputs dependency description lines.<br>
<li><b>mcpp</b>, other than GCC-specific-build, excludes a wider range of header files when using the -MM and -MMD options.<br>
</ol>
<br>

<h2><a name="2.4" href="#toc.2.4">2.4. Options by <b>mcpp</b> Behavioral Modes</a></h2>
<p><b>mcpp</b> has several behavioral modes.  For their specifications refer to sec <a href="#2.1"> 2.1</a>.</p>
<p>This manual lists the various <b>mcpp</b> behaviors by mode, which may not be easy to read.  Please be patient.  In this manual, all the uppercased names that do not begin with "__" and displayed in italics, such as <i>DIGRAPHS_INIT</i>, <i>TRUE</i>, <i>FALSE</i>, etc, are macros defined in system.H.  These macros are used only for compiling <b>mcpp</b> itself; the generated <b>mcpp</b> executable does not predefine them.  You must understand this point clearly.</p>
<p>The following options are available in Standard mode:</p>
<ul>
<li><samp>-+</samp><br>
Behave as a C++ preprocessor.  <b>mcpp</b> predefines the <tt>__cplusplus</tt> macro (its value is defined in system.H and defaults to 1), interprets the text from // to the end of a logical line as a comment and recognizes <samp>"::", ".*", "-&gt;*"</samp> each as a single token.  It evaluates "true" and "false" tokens in a #if expression to 1 and 0, respectively.  If <tt>__STDC__</tt> and <tt>__STDC_VERSION__</tt> are defined, they are undefined. (For GCC-specific-<b>mcpp</b>, <tt>__STDC__</tt> is not undefined for compatibility with GCC.)  The predefined macros that do not begin with "_" are also undefined.  However, extended characters are not converted to UCN. *1, *2<br>
<br>
<li><samp>-2</samp><br>
Reverse initial settings for the digraphs processing.  With <i>DIGRAPHS_INIT</i> == <i>FALSE</i>, <b>mcpp</b> recognizes digraphs.  Otherwise, it doesn't.<br>
<br>
<li><samp>-h &lt;n&gt;</samp><br>
Define the value of <tt>__STDC_HOSTED__</tt> macro with &lt;n&gt;.<br>
<br>
<li><samp>-S &lt;n&gt;</samp><br>
Change the value of <tt>__STDC__</tt> to &lt;n&gt; in C.  In C++, this option is ignored.  The range of &lt;n&gt; has to be 0-9.  With &lt;n&gt; set to 1 or higher, the predefined macros that do not begin with "_", such as <tt>unix</tt>, <tt>linux</tt>, are disabled.  S indicates <tt>__STDC__</tt>.  If this option is omitted, <tt>__STDC__</tt> is set to a default value (i.e. 1).<br>
For a GCC-specific-build, -pedantic, -pedantic-errors, or -lang-c89 is equivalent to -S1, so the next -S is ignored.
This option does not disable the non-conforming predefined macros such as <tt>unix, linux, i386</tt> for compatibility with GCC.
These macros are disabled only by -ansi or -std=iso* options.<br>
<br>
<li><samp>-V &lt;value&gt;</samp><br>
Change the values of the predefined macros <tt>__STDC_VERSION__</tt> for C and <tt>__cplusplus</tt> for C++ to &lt;value&gt;.  &lt;value&gt; is of a long type. (In C95, C99, and C++ Standard, this value is set to 199409L, 199901L and 199711L, respectively.)  With <tt>__STDC__</tt> set to 0, <tt>__STDC_VERSION__</tt> is always set to 0L, overriding the -V option.<br>
<br>
If this option is omitted for C, <tt>__STDC_VERSION__</tt> is set to the value of <i>STDC_VERSION</i> in system.H. (For GCC V.2.7 - V.2.9, 199409L.  For others, 0L.)  If specifying -V199901L results in <tt>__STDC_VERSION__</tt> &gt;= 199901L, <b>mcpp</b> conforms to the following C99 specifications (See <a href="#3.7"> 3.7</a>.):<br>
<br>
<ol>
<li>Treats the text from // to the end of a line as a comment. *3<br>
<li>Allows the sequence of p+, P+, p-, and P-, as well as e+, E+, e-, and E-, in the preprocessing-number.  This is to represent a bit pattern of a floating-point number in Hex, like 0x1.FFFFFEp+128.<br>
<li>Enables the _Pragma operator (A _Pragma( "foo bar") has the same effect as specifying a #pragma foo bar.)<br>
<li><b>mcpp</b> compiled with the <i>EXPAND_PRAGMA</i> macro set to <i>TRUE</i> will macro-expand an argument on a #pragma line that does not begin with STDC or MCPP. (By default, <i>EXPAND_PRAGMA</i> is set to <i>FALSE</i> in other than Visual C-specific-build and Borland C-specific-build, so macro expansion does not occur.)<br>
<li>Allows an escape sequence of Universal-Character-Name (UCN) in identifiers, character constants, string literals and pp-numbers. (This is only enabled in <i>STD</i> mode.)<br>
</ol>
Note that although C99 provides for variable argument macros, <b>mcpp</b> allows them in the C90 and C++ modes. *4<br>
<br>
In C++ also, when specifying -V199901L results in <tt>__cplusplus</tt> &gt;= 199901L, <b>mcpp</b> will enter the C99 compatibility mode, providing the above 2-4 enhancements.  (1 is enabled unconditionally and 5 is almost the same.)  These are <b>mcpp</b>'s own enhancements that do not conform to the C++ Standard.<br>
<br>
The -D option cannot be used with <tt>__STDC__</tt>, <tt>__STDC_VERSION__</tt>, and <tt>__cplusplus</tt>.  This is to distinguish system-defined macros from user-defined ones.<br>
</ul>
<p>The following options are available for <i>STD</i> mode:</p>
<ul>
<li><samp>-3</samp><br>
Reverse initial settings for the trigraphs processing.  With <i>TRIGRAPHS_INIT</i> == <i>FALSE</i>, <b>mcpp</b> recognizes trigraphs.  Otherwise, it does not.<br>
<li><samp>-K</samp><br>
Enable <b>macro notification mode</b> which embeds macro notifications into comments. This mode is designed to allow reconstruction of the original source position from the preprocessed output. The primary purpose of the macro notification mode is to allow C/C++ refactoring tools to refactor source code without having to implement a special-purpose C preprocessor. This mode is also handy for debugging macro expansions. The goal of macro notification mode is to annotate every macro expansion, while still allowing the code to be compiled. *5<br>
This mode is also enabled by the following pragma:
<pre>
#pragma MCPP debug macro_call
</pre>
The -K option has almost the same effect as this pragma placed at the top of an input file, except that predefined macros are notified only by this option.
About the specs of macro notification, see <a href="#3.5.8">3.5.8. #pragma MCPP debug macro_call</a>.<br>
This option implies -k option.
The -v option changes its meaning in this mode, and outputs more verbose notations.
On the other hand, -a (-x assembler-with-cpp) or -C options automatically disable -K option. *6<br>
<li><samp>-k</samp><br>
Keep horizontal white spaces ('\t' and space characters) without squeezing them into one space.
Comment is converted to spaces of the same length.
This option keeps the column positions of the source file in the preprocessed output, except within macro expansions. (The column positions of macros can be obtained with the -K option.) *7
</ul>
<p>Note:</p>
<p>*1 C++'s <tt>__STDC__</tt> is not desirable and causes many problems.  GCC document says that <tt>__STDC__</tt> needs to be predefined in C++ because many header files expect <tt>__STDC__</tt> to be defined.  The header files should be blamed for this.  For common parts among C90, C99 and C++, "<samp>#if __STDC__ || __cplusplus</samp>" should be used.</p>
<p>*2 Different from C99, the C++ Standard makes much of UCN.  So did C 1997/11 draft.  Half-hearted implementation is not permitted.  However, implementing Unicode in earnest is too much burden for preprocessor.</p>
<p>*3 In C90 <b>mcpp</b> treats // as a comment but issues a warning.</p>
<p>*4 This is for compatibility with GCC.</p>
<p>*5 If you install GCC-specific-mcpp, cc1 (cc1plus) is set up to receive the file preprocessed by <b>mcpp</b> with the -fpreprocessed option.
Though this option means that the input is already preprocessed,
cc1 still processes comments.
Therefore, you can safely pass the output of -K to cc1 with -fpreprocessed.
Furthermore, if you add the -save-temps option to the gcc (g++) command, the preprocessed output is left as a *.i (*.ii) file, which you can read with a refactoring tool.</p>
<p>*6 Comment insertion by the -K option causes column shifts in the source, and this makes a *.S file of GCC, which is not C/C++ source and is processed with the -x assembler-with-cpp option, impossible to assemble.
Also, comments kept by the -C option are easily confused with those inserted by the -K option.
Therefore these options cannot be used at the same time.</p>
<p>*7 This option fails to keep column positions in some particularly complex cases.
When line splicing by a &lt;backslash&gt;&lt;newline&gt; and line splicing by a line-crossing comment are intermingled on one output line, or when a comment spans more than 256 lines, the column position is lost.
Note that each '\v' and '\f' is converted to a space with or without this option.</p>
<br>

<h2><a name="2.5" href="#toc.2.5">2.5. Common Options Except for Some Compiler Systems</a></h2>
<p>The following 2 options can be used on UNIX-like systems, for either of compiler-independent-build and GCC-specific-build.
On GCC-specific-build, however, these will get an error if the GCC does not support them.</p>
<ul>
<li><samp>-m32</samp><br>
Predefine macros for 32bit mode.
If the CPU is <samp>x86_64</samp> or <samp>ppc64</samp>, predefined macros for 64bit mode are used by default.
With this option, however, those for <samp>i386</samp> or <samp>ppc</samp> respectively are used.<br>
<li><samp>-m64</samp><br>
Predefine macros for 64bit mode.
If the CPU is <samp>i386</samp> or <samp>ppc</samp>, predefined macros for 32bit mode are used by default.
With this option, however, those for <samp>x86_64</samp> or <samp>ppc64</samp> respectively are used.<br>
</ul>

<p>Because GCC has so many options, the GCC-specific-build of <b>mcpp</b> uses some options that differ from those of the other builds, in order to avoid conflicts with GCC.  Note that the options in compiler-independent-build are all the same even if compiled by GCC.  The options common to the builds other than GCC-specific are as follows.</p>
<ul>
<li><samp>-a</samp><br>
Accept the following notations used in some assembler sources without causing an error.<br>
<ol>
<li>
<pre>
#APP
</pre>
<p>If the token that follows the line top # does not agree with any of C directives as above, <b>mcpp</b> outputs this line as it is without causing an error.</p>
<li>
<pre>
# + any comment.
</pre>
<p>If the token that follows the line top # is neither an identifier nor a pp-number, <b>mcpp</b> discards the line with a warning, without causing an error.</p>
<li>
<pre>
"A very very
long long
string literal"
</pre>
<p>The above old-fashioned string literals are concatenated into "<samp>A very very\nlong long\nstring literal</samp>".</p>
<li>Even if token concatenation using a ## operator generates an invalid pp-token, it is not regarded as error.<br><br>
<li><b>mcpp</b> does not insert spaces around a macro expansion result, and does not regard an unintended token merging of the macro expansion result with its adjacent token as an error.<br>
</ol>
<p>These notations sometimes occur in GNU source code.  In the GCC-specific-build, however, the corresponding option is -x assembler-with-cpp or -lang-asm.<br>
This option cannot be used in <i>POSTSTD</i> mode.<br>
This manual calls this mode <b>lang-asm</b> mode.<br>
This mode is recommended when you use <b>mcpp</b> as a macro processor for some text other than C/C++, for example, as a cpp called from xrdb.</p>
<li><samp>-I-</samp><br>
Cancel default include directories and enable only ones specified with an environment variable and the -I option.  Instead of -I-, GCC-specific-build uses -nostdinc.  In GCC, the -I- option provides quite different functionality.  See <a href="#2.6"> 2.6</a>.<br>
</ul>
<br>

<h2><a name="2.6" href="#toc.2.6">2.6. Options by Compiler System</a></h2>
<p>To use <b>mcpp</b> replacing the compiler system's resident preprocessor, install it in the directory where the resident preprocessor is located under an appropriate name.  Before copying <b>mcpp</b>, be sure to change the name of resident preprocessor so that it may not be overwritten.</p>
<p>For settings on Linux, FreeBSD, or CygWIN see <a href="#3.9.5"> 3.9.5</a>.
For settings in GCC 3.*, 4.*, see also <a href="#3.9.7"> 3.9.7</a>, and <a href="#3.9.7.1"> 3.9.7.1</a>.
For MinGW, see <a href="#3.9.7.1"> 3.9.7.1</a>.</p>
<p>Possibly the compiler driver cannot pass some options to <b>mcpp</b> in a normal manner.  However, GCC provides the -Wp almighty option to allow you to pass any options to the preprocessor.  For example, if you specify as follows:</p>
<pre>
gcc -Wp,-W31,-Q23
</pre>
<p>The -W31 and -Q23 options are passed to the preprocessor.  The options you want to pass to the preprocessor must be specified following -Wp, with each option delimited by ",". *1, *2</p>
<p>For other compiler systems, if their compiler driver source is available, it is recommended that this type of an almighty option should be added to the source.  If you modify the compiler driver source code in the way that, for example, when -P&lt;opt&gt; is specified, only -&lt;opt&gt; is passed to preprocessor, it would be very convenient because any options can be passed.</p>
<p>An alternative way to use all the options of <b>mcpp</b> is to write a makefile that first preprocesses with <b>mcpp</b> and then compiles the output file of <b>mcpp</b> as a source file.  For this method, refer to sections <a href="#2.9"> 2.9</a> and <a href="#2.10"> 2.10</a>.</p>
<p>The following options are available for some compiler-specific-builds.  The compiler-independent-build does not have these options, of course.</p>

<p>The following options are available for the LCC-Win32-specific-build.</p>
<ul>
<li><samp>-g &lt;n&gt;</samp><br>
Define the <tt>__LCCDEBUGLEVEL</tt> macro as &lt;n&gt;.<br>
<li><samp>-O</samp><br>
Defines the <tt>__LCCOPTIMLEVEL</tt> macro as 1.<br>
</ul>
<p>The following options are available for the Visual C-specific-build.</p>
<ul>
<li><samp>-arch:SSE, -arch:SSE2</samp><br>
Define the macro <tt>_M_IX86_FP</tt> as 1, 2 respectively.<br>
<li><samp>-Fl &lt;file&gt;</samp><br>
Same as -include &lt;file&gt; for GCC.<br>
<li><samp>-G&lt;n&gt;</samp><br>
If &lt;n&gt; is one of 3, 4, 5, 6, B, define the macro <tt>_M_IX86</tt> as 300, 400, 500, 600, 600, respectively.<br>
<li><samp>-GR</samp><br>
Define the macro <tt>_CPPRTTI</tt> to 1.<br>
<li><samp>-GX</samp><br>
Define the macro <tt>_CPPUNWIND</tt> to 1.<br>
<li><samp>-GZ</samp><br>
Define the macro <tt>__MSVC_RUNTIME_CHECKS</tt> to 1.<br>
<li><samp>-J</samp><br>
Define the macro <tt>_CHAR_UNSIGNED</tt> to 1.<br>
<li><samp>-RTC*</samp><br>
If any of -RTC1, -RTCc, -RTCs, -RTCu or a similar option is specified, define the macro <tt>__MSVC_RUNTIME_CHECKS</tt> to 1.<br>
<li><samp>-Tc, -TC</samp><br>
Specify that the source is written in C.  The result is same with or without this option.<br>
<li><samp>-Tp, -TP</samp><br>
Same as -+.<br>
<li><samp>-u</samp><br>
Same as -N.<br>
<li><samp>-Wall</samp><br>
Same as -W17 (-W1 -W16).<br>
<li><samp>-WL</samp><br>
Same as -j.<br>
<li><samp>-w</samp><br>
Same as -W0.<br>
<li><samp>-X</samp><br>
Same as -I-.<br>
<li><samp>-Za</samp><br>
Undefine the macro <tt>_MSC_EXTENSIONS</tt> and prohibit '$' in identifiers.<br>
<li><samp>-Zc:wchar_t</samp><br>
Define the macros <tt>_NATIVE_WCHAR_T_DEFINED</tt> and  <tt>_WCHAR_T_DEFINED</tt> to 1.<br>
<li><samp>-Zl</samp><br>
Define the macro <tt>_VC_NODEFAULTLIB</tt> to 1.<br>
</ul>

<p><b>mcpp</b> on Mac OS X accepts the following option, on both of GCC-specific-build and compiler-independent-build.</p>
<ul>
<li><samp>-F &lt;framework&gt;</samp><br>
Put the &lt;framework&gt; directory at the top of the framework directories to search.
The standard framework directories are /System/Library/Frameworks and /Library/Frameworks by default.
</ul>
<p><b>mcpp</b> on Mac OS X accepts the following option on GCC-specific-build.</p>
<ul>
<li><samp>-arch &lt;arch&gt;</samp><br>
Change the target architecture from the default one to &lt;arch&gt;.
This changes some predefined macros.
&lt;arch&gt; should be i386 or x86_64 for the x86 preprocessor, and ppc or ppc64 for the ppc preprocessor.
(You can specify any of these 4 for the <samp>gcc</samp> command.
The <samp>gcc</samp> command invokes the preprocessor for x86 on '<samp>-arch i386</samp>' or '<samp>-arch x86_64</samp>' options, and the one for ppc on '<samp>-arch ppc</samp>' or '<samp>-arch ppc64</samp>' options.)
</ul>
<p>The following options (through the end of section 2.6) are available for the GCC-specific-build.  Note that since <tt>__STDC__</tt> is set to 1 for GCC, the result is the same with or without the -S1 option.</p>
<p>The following options are available in all modes.</p>
<ul>
<li><samp>-$</samp><br>
Same as -fno-dollars-in-identifiers.<br>
<li><samp>-b</samp><br>
Output line number information just like C sources.<br>
The format used to pass the line number information from a preprocessor to compiler-proper is usually as follows:<br>
<pre>
#line 123 "filename"
</pre>
Most compiler systems can use this C source format, but some systems cannot.  By default, in a compiler-specific-build for a compiler system that cannot use the C source format, <b>mcpp</b> outputs the line number information in a format that the compiler-proper can accept.<br>
With this option specified, however, <b>mcpp</b> outputs the line number information in the C source format even in such a compiler-specific-build.  This option is used with '#pragma MCPP preprocess' to pre-preprocess header files.<br>
<li><samp>-dD, -dM</samp><br>
Output valid macro definitions in the form of #define lines at the end of preprocessing.<br>
With the -dD option specified, the preprocessing result is output too.  Predefined macros are not output.<br>
With the -dM option specified, the preprocessing result is not output, and predefined macros are output except the Standard predefined ones. *3, *4<br>
<li><samp>-fexceptions</samp><br>
Define the macro <tt>__EXCEPTIONS</tt> to 1.<br>
<samp>-fno-exceptions</samp> does not define this macro.<br>
<li><samp>-finput-charset=&lt;encoding&gt;</samp><br>
Same as -e &lt;encoding&gt;.  Note that GCC converts the &lt;encoding&gt; to UTF-8 with this option, whereas <b>mcpp</b> does not convert any encoding.<br>
<li><samp>-fno-dollars-in-identifiers</samp><br>
Prohibit '$' in identifiers. (Allow it by default.)<br>
<li><samp>-fPIC, -fpic, -fPIE, -fpie</samp><br>
Any of these options defines both of the macros <tt>__PIC__, __pic__</tt> to 1.<br>
<li><samp>-fstack-protector</samp><br>
Define the macro <tt>__SSP__</tt> to 1.<br>
<li><samp>-fstack-protector-all</samp><br>
Define the macro <tt>__SSP_ALL__</tt> to 2.<br>
<li><samp>-fworking-directory</samp><br>
Emit a special line as the second line of preprocessor's output to convey the current working directory.<br>
<li><samp>-I-</samp><br>
Switch the specification of the -I &lt;directory&gt; before and after this option; directories specified with the -I options before -I- are used to search for header files only in the form of #include "header.h"; the directories specified with -I after -I-, if any, are used to search for all #include directives.  In addition, during the former search, includer's directories are not used.<br>
<li><samp>-include &lt;file&gt;</samp><br>
Include the &lt;file&gt; before processing the main source file.  This is equivalent to writing #include &lt;file&gt; at the beginning of the main source file.<br>
<li><samp>-iquote &lt;dir&gt;</samp><br>
Add &lt;dir&gt; to the include path for #include "header.h" form.<br>
<li><samp>-isysroot=&lt;dir&gt;, -isysroot &lt;dir&gt;, --sysroot=&lt;dir&gt;, --sysroot &lt;dir&gt;</samp><br>
Use &lt;dir&gt; as the logical root directory for system headers, that is, prefix &lt;dir&gt; to the path-list of system header directory.
For example, if the default include directory is /usr/include and &lt;dir&gt; is /Developer/SDKs/MacOSX10.4u.sdk, then alter the include directory to /Developer/SDKs/MacOSX10.4u.sdk/usr/include.<br>
<li><samp>-isystem &lt;dir&gt;</samp><br>
Add &lt;dir&gt; to the include path immediately before system-specific directories and immediately after site-specific directories.<br>
<li><samp>-lang-c, -x c</samp><br>
Perform C preprocessing.  The same as not specifying this option at all.<br>
<li><samp>-mmmx</samp><br>
Predefine a macro <tt>__MMX__</tt> to 1.<br>
<samp>-mno-mmx</samp> undefines <tt>__MMX__</tt>.<br>
<li><samp>-nostdinc</samp><br>
Same as -I- for other compiler systems.<br>
<li><samp>-undef</samp><br>
Same as -N.<br>
<li><samp>-O?</samp><br>
If ? is a non-0 digit, define a macro <tt>__OPTIMIZE__</tt> to 1.<br>
<li><samp>-Wcomment, -Wcomments</samp><br>
Same as -W1.  The result is same with or without this option.<br>
<li><samp>-Wtrigraphs</samp><br>
Same as -W16.<br>
<li><samp>-Wall</samp><br>
Same as -W17. (With -Wall, <b>mcpp</b> does not issue class 2 and 4 warnings, because these warnings are issued frequently and are annoying for the standard header files of Linux and some other systems.  Class 8 warnings are generally superfluous and bothersome, but are helpful for confirming portability and the like.  To enable all warning classes, specify gcc -Wp,-W31.)<br>
<li><samp>-w</samp><br>
Same as -W0.<br>
</ul>
<p>The following options are available for Standard mode.</p>
<ul>
<li><samp>-ansi</samp><br>
Define macro <tt>__STRICT_ANSI__</tt> as 1.
Disable non-conforming predefined macros such as <tt>linux, i386</tt>.<br>
Do not remove the comma preceding an absent variable argument of a GCC-specific variadic macro. *5<br>
<li><samp>-digraphs</samp><br>
Recognize digraphs.  Digraphs specification is also reversed by -2.<br>
<li><samp>-lang-c89, -std=gnu89</samp><br>
Same as -S1.  Not only C90 but also C95 specifications are used.  The result is same with or without this option.<br>
<li><samp>-std=c89, -std=c90</samp><br>
Almost same as -S1, except these imply -ansi.<br>
<li><samp>-lang-c99, -lang-c9x, -std=gnu99, -std=gnu9x</samp><br>
Same as -V199901L.<br>
<li><samp>-std=c99, -std=c9x</samp><br>
Same as -V199901L and also imply -ansi.<br>
<li><samp>-lang-c++, -x c++</samp><br>
Perform C++ preprocessing.  Same as -+.<br>
<li><samp>-std=c++98</samp><br>
Same as -+ and also implies -ansi.<br>
<li><samp>-pedantic, -pedantic-errors</samp><br>
Same as -W7 (i.e. -W1 -W2 -W4).<br>
<li><samp>-std=iso&lt;n&gt;:&lt;ym&gt;</samp><br>
Specify a version of the C or C++ Standard.  &lt;n&gt; is 9899 for C and 14882 for C++.  If &lt;n&gt; is 9899, &lt;ym&gt; is any of 1990, 199409, 1999 and 199901.  If &lt;n&gt; is 14882, &lt;ym&gt; is 199711.  If you specify any other value for &lt;ym&gt;, <tt>__STDC_VERSION__</tt> or <tt>__cplusplus</tt> is set to that value.  In this case, &lt;ym&gt; must be specified in six digits, like 200503.<br>
These options imply -ansi.
On the other hand, -std=gnu* do not imply -ansi, also -pedantic does not imply -ansi.<br>
</ul>
<p>For <i>STD</i> mode, following options are available.  (These cannot be used in <i>POSTSTD</i> mode.)</p>
<ul>
<li><samp>-lang-asm, -x assembler-with-cpp</samp><br>
Same as -a for other compiler systems.
Specify lang-asm mode.
In GCC-specific-build, a macro <tt>__ASSEMBLER__</tt> is defined to 1, and '$' in identifiers is prohibited.
When the main source file is named *.S, lang-asm mode is implicitly specified without this option.<br>
<li><samp>-trigraphs</samp><br>
Recognize trigraphs.  Trigraphs specification is also reversed by -3.<br>
</ul>
<p>The following option is available for pre-Standard mode of GCC-specific-build.</p>
<ul>
<li><samp>-traditional, -traditional-cpp</samp><br>
Same as -@old.<br>
</ul>
<p>The next option is available on CygWIN GCC-specific-build.</p>
<ul>
<li><samp>-mno-cygwin</samp><br>
Alter the include directory from /usr/include to /usr/include/mingw, and alter the predefined macros from the ones for cygwin1.dll to the ones for msvcrt.dll.<br>
</ul>
<p><b>mcpp</b> accepts the following options without error but does nothing about them. (It sometimes issues a warning.)</p>
<ul>
<li><samp>-A &lt;predicate(answer)&gt;</samp><br>
<b>mcpp</b> ignores this option.  In GCC, this option is equivalent to writing #assert &lt;predicate (answer)&gt; in the source code.  Standard C does not permit extension directives other than #pragma.  Fortunately, so far, gcc, by default, passes an equivalent macro with the -D option, so there are no actual problems unless a source program uses #assert, which is a rare case.<br>
<br>
<li><samp>-g &lt;n&gt;</samp><br>
<li><samp>-idirafter &lt;dir&gt;</samp><br>
<li><samp>-iprefix &lt;dir&gt;, -iwithprefix &lt;dir&gt;, -iwithprefixbefore &lt;dir&gt;</samp><br>
<li><samp>-noprecomp</samp><br>
<li><samp>-remap</samp><br>
</ul>
<p>In GCC V.3.3 or later, the preprocessor has been absorbed into the compiler, and an independent preprocessor does not exist.  Moreover, gcc often passes options that are not meant for the preprocessor to it, even when invoked with the -no-integrated-cpp option.  The GCC-specific-build of <b>mcpp</b> for V.3.3 or later ignores the following options as such stray options if it cannot recognize them.</p>
<ul>
<li><samp>-c</samp><br>
<li><samp>-E</samp><br>
<li><samp>-f*</samp><br>
<li><samp>-m*</samp><br>
<li><samp>-quiet</samp><br>
<li><samp>-W*</samp><br>
</ul>
<p>Note:</p>
<p>*1 -Wa and -Wl are almighty options for assembler and linker, respectively.  The documentation on UNIX/System V/cc describes these options.  Probably, GCC provides the -W&lt;x&gt; option for compatibility.</p>
<p>*2 In GCC V.3, cpp was absorbed into cc1 (cc1plus). Therefore, the options specified with -Wp are normally passed to cc1 (cc1plus).  To have cpp (cpp0), not cc1, preprocess, the -no-integrated-cpp option must be specified on gcc invocation.</p>
<p>*3 GCC V.3.3 or later predefines several dozen macros.  The -dD option does not regard these macros as predefined, and outputs them.</p>
<p>*4 The output of -dM option is similar to that of '#pragma MCPP put_defines' ('#put_defines') with the following differences:</p>
<ol>
<li>'put_defines' outputs also Standard predefined macros as comments.<br>
<li>'put_defines' also outputs the file name and the line number of each macro definition as a comment, arranged in a readable format.  On the other hand, the -d* options output in the same simple format as GCC, because some makefiles expect that format.<br>
</ol>
<p>*5 Refer to <a href="#3.9.6.3">3.9.6.3</a>.</p>
<br>

<h2><a name="2.7" href="#toc.2.7">2.7. Environment Variables</a></h2>
<p>In the compiler-independent-build of <b>mcpp</b>, no include directories are set up other than /usr/include and /usr/local/include on UNIX systems.  Other directories, if required, must be specified using environment variables or run-time options.  The environment variable used by the compiler-independent-build is INCLUDE for C and CPLUS_INCLUDE for C++.  By default, the search for a file starts from the includer's source directory. (Refer to <a href="#4.2"> 4.2</a> for the search rule.)  In addition, the include directories on Linux are in some confusion, so a special setup is necessary to cope with this problem.
Refer to <a href="#3.9.9"> 3.9.9</a> for the problem.</p>
<p>For the default include directories on GCC-specific-build, refer to noconfig/*.dif files, and for search rule and environment variable name, refer to <a href="#4.2"> 4.2</a>.</p>
<p>For the environment variables LC_ALL, LC_CTYPE and LANG, refer to <a href="#2.8"> 2.8</a>.</p>
<br>

<h2><a name="2.8" href="#toc.2.8">2.8. Multi-Byte Character Encodings</a></h2>
<p><b>mcpp</b> can process various multi-byte character encodings as follows.</p>
<blockquote>
<table>
  <tr><th>EUC-JP      </th><td>Japanese extended UNIX code (UJIS)</td></tr>
  <tr><th>shift-JIS   </th><td>Japanese MS-Kanji</td></tr>
  <tr><th>GB-2312     </th><td>EUC-like Chinese encoding (Simplified Chinese)</td></tr>
  <tr><th>Big-Five    </th><td>Taiwanese encoding (Traditional Chinese)</td></tr>
  <tr><th>KSC-5601    </th><td>EUC-like Korean encoding (KSX 1001)</td></tr>
  <tr><th>ISO-2022-JP1</th><td>International standard Japanese</td></tr>
  <tr><th>UTF-8       </th><td>A kind of Unicode encoding</td></tr>
</table>
</blockquote>
<p>The encoding used during execution can be specified as follows (Priority is given in this order):</p>
<ol>
<li>The encoding specified in '#pragma __setlocale( "&lt;encoding&gt;")' in source code. (For Visual C-specific-build, '#pragma setlocale( "&lt;encoding&gt;")'.)  This directive allows you to specify several encodings in one source file.<br>
<li>The encoding specified with -e &lt;encoding&gt; or -finput-charset=&lt;encoding&gt; as run-time option.<br>
<li>The encoding specified with the LC_ALL, LC_CTYPE and LANG environment variables.  Priority is given in this order.<br>
<li>The default encoding specified when <b>mcpp</b> is compiled.<br>
</ol>
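<p>The priority order above can be sketched in C as follows.  This is only an illustration under stated assumptions, not <b>mcpp</b>'s actual code: the function name select_encoding and its parameters are hypothetical, and each argument is assumed to be NULL when the corresponding source is absent.</p>
<pre>
#include &lt;stddef.h&gt;

/* Hypothetical helper, not part of mcpp.  Returns the encoding name taken
 * from the highest-priority source that is present.  An absent source is
 * passed as NULL; build_default is assumed to be non-NULL. */
static const char *select_encoding(const char *pragma_enc,   /* #pragma __setlocale  */
                                   const char *option_enc,   /* -e or -finput-charset */
                                   const char *lc_all,       /* LC_ALL               */
                                   const char *lc_ctype,     /* LC_CTYPE             */
                                   const char *lang,         /* LANG                 */
                                   const char *build_default)
{
    const char *order[6];
    size_t i;

    order[0] = pragma_enc;
    order[1] = option_enc;
    order[2] = lc_all;
    order[3] = lc_ctype;
    order[4] = lang;
    order[5] = build_default;

    /* The first source that is present wins. */
    for (i = 0; i &lt; 6; i++) {
        if (order[i] != NULL)
            return order[i];
    }
    return NULL;
}
</pre>
<p>For example, if -e sjis is given while LC_ALL is set to ja_JP.eucJP, this sketch selects sjis, because the run-time option ranks above the environment variables.</p>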
<p>An &lt;encoding&gt; is specified in basically the same way in #pragma __setlocale, the -e option, and the environment variables.  In the table below, the encoding on the left-hand side is selected by any &lt;encoding&gt; name on the right-hand side.  &lt;encoding&gt; is not case sensitive, and '-' and '_' are ignored.  Moreover, if the name contains a '.', the character sequence up to and including the '.' is ignored.  Therefore, EUC_JP, EUC-JP, EUCJP, euc-jp, eucjp and ja_JP.eucJP are all regarded as the same.  '*' represents any character sequence of zero or more bytes. (For example, iso8859-1 and iso8859-2 both match iso8859*.)</p>
<blockquote>
<table>
  <tr><th>EUC-JP       </th><td>eucjp, euc, ujis</td></tr>
  <tr><th>shift-JIS    </th><td>sjis, shiftjis, mskanji</td></tr>
  <tr><th>GB-2312      </th><td>gb2312, cngb, euccn</td></tr>
  <tr><th>BIG-FIVE     </th><td>bigfive, big5, cnbig5, euctw</td></tr>
  <tr><th>KSC-5601     </th><td>ksc5601, ksx1001, wansung, euckr</td></tr>
  <tr><th>ISO-2022-JP1 </th><td>iso2022jp, iso2022jp1, jis</td></tr>
  <tr><th>UTF-8        </th><td>utf8, utf</td></tr>
  <tr><th>Not specified</th><td>c, en*, latin*, iso8859*</td></tr>
</table>
</blockquote>
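<p>The matching rules for &lt;encoding&gt; names (case is ignored, '-' and '_' are ignored, and everything up to a '.' is ignored) can be sketched as a small normalization step.  The following is a minimal sketch, not <b>mcpp</b>'s own implementation; the function names are hypothetical, wildcard entries such as en* are not handled, and cutting at the first '.' is an assumption of this sketch.</p>
<pre>
#include &lt;ctype.h&gt;
#include &lt;string.h&gt;

/* Hypothetical helper, not mcpp's code.  Writes a normalized copy of
 * 'name' into 'out' (capacity 'out_size'): the part up to and including
 * the first '.' is dropped, '-' and '_' are dropped, and the rest is
 * converted to lower case. */
static void normalize_encoding_name(const char *name, char *out, size_t out_size)
{
    const char *dot = strchr(name, '.');
    const char *p = (dot != NULL) ? dot + 1 : name;
    size_t n = 0;

    if (out_size == 0)
        return;
    for (; *p != '\0' &amp;&amp; n + 1 &lt; out_size; p++) {
        if (*p == '-' || *p == '_')
            continue;                       /* ignored characters */
        out[n++] = (char)tolower((unsigned char)*p);
    }
    out[n] = '\0';
}

/* Returns nonzero when two spellings normalize to the same name. */
static int same_encoding_name(const char *a, const char *b)
{
    char na[64];
    char nb[64];

    normalize_encoding_name(a, na, sizeof na);
    normalize_encoding_name(b, nb, sizeof nb);
    return strcmp(na, nb) == 0;
}
</pre>
<p>Under these rules, same_encoding_name("ja_JP.eucJP", "EUC-JP") is nonzero, since both spellings reduce to eucjp.</p>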
<p>If any of the following encodings is specified, <b>mcpp</b> is no longer able to recognize multi-byte characters: C, en* (english), latin* and iso8859*.  When a non-ASCII ISO-8859 Latin-&lt;n&gt; single-byte character set is used, one of these encodings must be specified.  When an empty name is used (#pragma __setlocale( "")), the encoding is restored to the default.</p>
<p>Only in the Visual C-specific-build, the following encoding names can also be specified with '#pragma setlocale', for compatibility with Visual C++.  It is recommended that you use these names, because the Visual C++ compiler cannot recognize any other encoding names.  ('-' can be omitted for <b>mcpp</b>, but not for the Visual C++ compiler-proper.)</p>
<blockquote>
<table>
  <tr><th>shift-JIS    </th><td>japanese, jpn</td></tr>
  <tr><th>GB-2312      </th><td>chinese-simplified, chs</td></tr>
  <tr><th>BIG-FIVE     </th><td>chinese-traditional, cht</td></tr>
  <tr><th>KSC-5601     </th><td>korean, kor</td></tr>
  <tr><th>Not specified</th><td>C, english</td></tr>
</table>
</blockquote>
<p>In Visual C++, the default multi-byte character encoding varies depending on the language parameter and on the "Region and Language Option" setting of Windows.  However, a #pragma setlocale specification takes precedence over these Windows settings.</p>
<p>GCC sometimes fails to handle the shift-JIS, ISO2022JP and BIG-FIVE encodings, which contain bytes of the value 0x5c.
So, the GCC-specific-build of <b>mcpp</b> compensates for this. *1</p>
<p>Note</p>
<p>*1 According to gcc's info, if the --enable-c-mbchar option is specified when configuring GCC itself, that GCC recognizes an encoding specified by the environment variable LANG set to one of C-EUCJP, C-SJIS or C-JIS.
This way of configuring seems to have been available since 1998, but it has seldom been used, and its implementation does not work.
Although the GCC-specific-build of <b>mcpp</b> used to support these environment variable settings, such as LANG=C-SJIS, that feature was removed as of V.2.7.<br>
GCC's info also says that, besides LANG, the environment variables LC_ALL and LC_CTYPE can be used to specify an encoding.  In practice, however, setting LC_ALL or LC_CTYPE changes only the diagnostic messages.</p>
<br>

<h2><a name="2.9" href="#toc.2.9">2.9. How to Use <b>mcpp</b> in One-Pass Compilers</a></h2>
<p>Compilers whose preprocessor is integrated into the compiler itself are called one-pass compilers.  These include Visual C, Borland C, and LCC-Win32.  Such compilers are becoming more popular because they achieve a slightly higher processing speed.  However, better hardware performance is already shortening the time spent on preprocessing.  More fundamentally, there is much value in preprocessing being a common phase that is mostly independent of run-time environments and compiler systems. It is therefore not desirable for one-pass compilers to become more popular, since that will lead to more compiler-system-specific specifications.</p>
<p>In any case, it is impossible to replace the preprocessor of a one-pass compiler with <b>mcpp</b>.  To use <b>mcpp</b>, a source program is first preprocessed with <b>mcpp</b>, and the output is then passed to the one-pass compiler.  Preprocessing thus takes place twice, which is wasteful but unavoidable.  Using <b>mcpp</b> still has merits: it checks the source, and it provides functions that are not available in the resident preprocessor.</p>
<p>To use <b>mcpp</b> with a one-pass compiler, the procedure must be written in makefile.  For sample procedures, refer to the makefile re-compilation settings used to compile <b>mcpp</b> itself, such as visualc.mak, borlandc.mak, and lcc_w32.mak.</p>
<p>Although the GCC 3 and 4 compilers now integrate the preprocessing facility into the compiler itself, gcc provides an option to use an external preprocessor.  Use this option when <b>mcpp</b> is used. (See <a href="#3.9.7"> 3.9.7</a>.)</p>
<br>

<h2><a name="2.10" href="#toc.2.10">2.10. How to Use <b>mcpp</b> in IDE</a></h2>
<p>It is difficult to use <b>mcpp</b> in Integrated Development Environment (IDE) because IDE's GUI follows compiler-system-specific specifications and internal interfaces are not usually made available to third parties.  Furthermore, one-pass compilers make it more difficult to insert a phase to use <b>mcpp</b>.</p>
<p>This subsection describes how to make <b>mcpp</b> available in the Visual C++ 2003, 2005 and 2008 IDE on Windows.  For Borland C and LCC-Win32, use the compiler-specific-build on command lines.</p>
<p>It also describes how to make <b>mcpp</b> available in Mac OS X / Xcode.app / Apple-GCC.</p>

<h3><a name="2.10.1" href="#toc.2.10.1">2.10.1. How to Make <b>mcpp</b> Available in Visual C++ IDE</a></h3>
<p><b>mcpp</b> cannot be used in a normal "project" since the internal specifications of Visual C++'s IDE are not made available to third parties and the compiler is a one-pass compiler.  However, once a makefile that uses <b>mcpp</b> is created, Visual C++'s IDE can recognize the makefile and you can create a "makefile project" using that file.  This allows you to utilize most of the IDE functions, including source editing, search, and source level debugging.</p>
<p>"Creating a Makefile Project" of a Visual C++ 2003 document, Visual C++ 2005 Express Edition Help and Visual C++ 2008 Express Edition Help describe how to make a makefile project.  Perform the following procedure to create a makefile project.</p>
<ol>
<li>Log in as a user with debugging privilege. *1<br>
<li>Write a makefile that specifies <b>mcpp</b>. (Refer to noconfig/visualc.mak.) *2<br>
<li>Start Visual Studio. *3<br>
<li>Click "New Project" to display the "New Project" window.  Select "Makefile Project" and specify "Name" and "Location", and then click "OK".<br>
<li>Then the "Makefile Application Wizard" window appears.  Click "Application settings", and enter appropriate values in the "Build command line", "Output", "Clean commands", and "Rebuild command line" fields.<br>
Let me explain the appropriate values for these fields by taking an example of making the compiler-independent-build of <b>mcpp</b> itself. (Assuming the name of <b>mcpp</b> executable as mcpp.exe.)<br>
<pre>
"Build command line":   nmake
"Output":               mcpp.exe
"Clean command":        nmake clean
"Rebuild command line": nmake PREPROCESSED=1
</pre>
To make the Visual C-specific-build of <b>mcpp</b>, add an option COMPILER=MSC as:
<pre>
"Build command line":   nmake COMPILER=MSC
"Output":               mcpp.exe
"Clean command":        nmake clean
"Rebuild command line": nmake COMPILER=MSC PREPROCESSED=1
</pre>
Since a Makefile project does not provide a 'make install' equivalent command, you must write the makefile in such a way that the commands you specify in "Build command line" and "Rebuild command line" also perform installation. *4<br>
If you do not compile <b>mcpp</b>, "Build command line" and "Rebuild command line" can be the same.<br>
When completed, click "Finish".<br>
<li>Then the Makefile project appears in "Solution Explorer".  Click the "Source Files" folder, choose "Add Existing Solution Item" from the "Project" menu, select all the source files, and then click "OK".  Then the source file names appear in Solution Explorer.<br>
</ol>
<p>You can now use every function, including Edit, Build, Rebuild and Debugging.</p>
<p>Note:</p>
<p>*1 On VC 2003 and 2005, to use the debugging function under Windows XP pro or Windows 2000, a user must belong to a group called "Debugger users".  However, Windows XP HE does not provide such a group, so one has to log in as an administrator. On VC 2008, this restriction on user groups was lifted.</p>
<p>*2 In order to perform the source level debugging function, makefile must be written in such a way that cl.exe is called with the -Zi option appended to generate debugging information.</p>
<p>*3 If you start Visual Studio by selecting "Start" -&gt; "Programs", environment variables, such as for include directories, are not set.  In order to have these variables set, you should open the 'Visual Studio command prompt' to start Visual Studio by typing on VC 2003:</p>
<pre>
devenv &lt;Project File&gt; /useenv
</pre>
<p>On VC 2005 express edition and VC 2008 express edition:</p>
<pre>
vcexpress &lt;Project File&gt; /useenv
</pre>
<p>*4 You must have write permission for the directory into which you install <b>mcpp</b>.
If you try to install into the 'bin' or 'lib' directories of the compiler system, the permission should be carefully set from an administrator account.
It is recommended to make the user account belong to the "Power users" or "Authenticated users" group, and to give that group "write" and "modify" permissions for the directory.
Another way of controlling the permission is to install the compiler system into a directory for which the user has write permission, such as a shared directory.</p>

<h3><a name="2.10.2" href="#toc.2.10.2">2.10.2. How to Make <b>mcpp</b> available in Mac OS X / Xcode.app</a></h3>
<p>You can use Xcode.app, which is an IDE on Mac OS X, with <b>mcpp</b> without problems. *1</p>
<p>Xcode.app uses gcc (g++) in /Developer/usr/bin rather than /usr/bin for some reason.
(/Developer is the default install directory for Xcode.)
To use <b>mcpp</b> in Xcode.app, you must install GCC-specific-build for the gcc (g++) in that directory.
You should do as follows to install it.
(${mcpp_dir} means the directory where the source of <b>mcpp</b> is placed.)</p>
<pre>
export PATH=/Developer/usr/bin:$PATH
${mcpp_dir}/configure --enable-replace-cpp
make
sudo make install
</pre>
<p>The installation method is the same as that for gcc in /usr/bin, except for the PATH setting.
So, please refer to INSTALL for installation to a cross-compiler or installation of a universal binary.</p>
<p>After installing <b>mcpp</b> in such a way, you can use Xcode.app without any special setting for <b>mcpp</b>.
Also the Apple-GCC-specific *.hmap files, which are "header map" files generated by Xcode.app, are read and processed by <b>mcpp</b>.
However, <b>mcpp</b> does not process precompiled-header.
It processes '<samp>#include "header.pch"</samp>' as an ordinary #include.
Also, <b>mcpp</b> does not preprocess Objective-C and Objective-C++, so *.m and *.mm source files are directly handed on to cc1obj and cc1objplus, bypassing <b>mcpp</b>.</p>
<p>When you use <b>mcpp</b>-specific options, specify them as follows:<br>
From the menu bar at the top of the screen in Xcode.app, select "Project" &gt; "Edit Project Settings".
The "project editor" window will appear.
Then, select its "Build" pane, and edit the "Other C flags" item.
The options should be specified following '-Wp,' and separated by commas, for example:</p>
<pre>
-Wp,-23,-W3
</pre>
<p>Note:</p>
<p>*1 Here we refer to Mac OS X Leopard / Xcode 3.0.</p>
<br>

<h1><a name="3" href="#toc.3">3. Enhancements and Compatibility</a></h1>
<p><b>mcpp</b> has its own enhancements.  Each compiler-system-resident preprocessor has its own enhancements, some of which are not available in <b>mcpp</b>.  This section covers these enhancements and their compatibility problems.</p>
<p>Principally, <b>mcpp</b> outputs #pragma lines as they are.  This principle is applied to the #pragma lines processed by <b>mcpp</b> itself.  This is because the compiler-proper may interpret the same #pragma for itself.</p>
<p>However, <b>mcpp</b> does not output the lines beginning with '#pragma MCPP', since these are for <b>mcpp</b> only.  Also, <b>mcpp</b> does not output lines of '#pragma GCC' followed by 'poison', 'dependency' or 'system_header'.  Moreover, <b>mcpp</b> outputs none of '#pragma once', '#pragma push_macro' and '#pragma pop_macro', because they are useless in the later phases.
On the other hand, '#pragma GCC visibility *' is output, because it is for the compiler and the linker. *1</p>
<p><b>mcpp</b> compiled with <i>EXPAND_PRAGMA</i> == <i>TRUE</i> expands macros in #pragma lines (in practice, <i>EXPAND_PRAGMA</i> is set to <i>TRUE</i> only for the Visual C-specific-build and the Borland C-specific-build).  However, #pragma lines followed by STDC, MCPP or GCC are never expanded.</p>
<p>#pragma sub-directives are implementation-defined, hence there are risks of same name sub-directive having different meanings to different compiler-systems.  Some device is necessary to avoid name collision.  Moreover, when <i>EXPAND_PRAGMA</i> == <i>TRUE</i>, there should be a device to avoid the name of #pragma sub-directive itself being macro expanded.  This is why <b>mcpp</b>-specific sub-directives begin with '#pragma MCPP' and are not subject to macro expansion.  This device is adopted from '#pragma STDC' of C99 and '#pragma GCC' of GCC 3.</p>
<p>'#pragma once' is, however, implemented as it is, since this pragma has been implemented in many preprocessors and has now no risk of name collision.  '#pragma __setlocale' is prefixed with "__" instead of MCPP, because it has also meaning for compiler-proper, and because the prefix avoids user-name-space.</p>
<p>Note:</p>
<p>*1 The GCC-specific-build of <b>mcpp</b> only supports '#pragma GCC system_header' of the pragmas starting with GCC.  It does not support '#pragma GCC poison' and '#pragma GCC dependency'.</p>
<br>

<h2><a name="3.1" href="#toc.3.1">3.1. #pragma MCPP put_defines, #pragma MCPP preprocess and others</a></h2>
<p><b>mcpp</b> in Standard mode uses '#pragma MCPP put_defines', '#pragma MCPP preprocess' and '#pragma MCPP preprocessed'.  Pre-Standard mode uses #put_defines, #preprocess and #preprocessed.  Let me explain by taking an example of #pragma.</p>
<p>When <b>mcpp</b> encounters a '#pragma MCPP put_defines' directive, it outputs all the macros defined at that time in the form of #define lines.  Of course, #undef-ed macros are not output.  The macros that cannot be #defined or #undef-ed, such as <tt>__STDC__</tt>, are also output in the form of #define lines, but are enclosed in comment marks.  (Since <tt>__FILE__</tt> and <tt>__LINE__</tt> are special macros defined dynamically on each macro invocation, the replacement list output here means nothing.)</p>
<p>In pre-Standard mode and <i>POSTSTD</i> mode, <b>mcpp</b> does not record the parameter names of function-like macro definitions.  So, these directives mechanically represent the first, second, third, ... parameters as a, b, c, ... and so on.  From the 27th parameter on, the names continue with a1, b1, c1, ..., then a2, b2, c2, ... and so on.</p>
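<p>The mechanical naming scheme just described can be illustrated with a short C function.  This is a sketch of the naming rule only, not <b>mcpp</b>'s code; the function name placeholder_param_name is hypothetical, and the parameter position n is assumed to be 1 or greater.</p>
<pre>
#include &lt;stdio.h&gt;

/* Hypothetical helper, not mcpp's code.  Writes the placeholder name of
 * the n-th macro parameter (n counts from 1) into 'out':
 * parameters 1..26 are a..z, 27..52 are a1..z1, 53..78 are a2..z2, ... */
static void placeholder_param_name(int n, char *out, size_t out_size)
{
    int letter = 'a' + (n - 1) % 26;    /* position within the alphabet */
    int round = (n - 1) / 26;           /* 0 for the first 26 parameters */

    if (round == 0)
        snprintf(out, out_size, "%c", letter);
    else
        snprintf(out, out_size, "%c%d", letter, round);
}
</pre>
<p>With this rule the 26th parameter is named z, the 27th a1, and the 53rd a2.</p>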
<p>If you enter the following directive after invoking <b>mcpp</b> from keyboard without specifying input and output files, all the predefined macros are listed.</p>
<pre>
#pragma MCPP put_defines
</pre>
<p>It also outputs a comment to indicate the source file name where each macro definition is found, as well as its line number.  If you invoke <b>mcpp</b> with options such as -S1 or -N, you will see a different set of predefined macros.</p>
<p>When <b>mcpp</b> encounters '#pragma MCPP preprocess' directive, it outputs the following line:</p>
<pre>
#pragma MCPP preprocessed
</pre>
<p>This indicates that the source file has been already preprocessed by <b>mcpp</b>.</p>
<p>When <b>mcpp</b> encounters a '#pragma MCPP preprocessed' directive, it determines that the source file has already been preprocessed by <b>mcpp</b> and outputs the code it reads as it is, until it encounters a #define line.  When <b>mcpp</b> does encounter a #define directive, it determines that the rest of the source file consists entirely of #define lines, and defines the macros.  At this time, <b>mcpp</b> also records the source file name and line number given in the comment. *1, *2</p>
<p>A '#pragma MCPP preprocessed' is applied only to the lines that follow the directive in the source file where the '#pragma MCPP preprocessed' directive is found.  If the source file is an #included one, when control is returned to the #including file, '#pragma MCPP preprocessed' is no longer applied.</p>
<p>Note:</p>
<p>*1  Actual processing is a little more complex.  When <b>mcpp</b> encounters a '#pragma MCPP preprocessed', <b>mcpp</b> outputs lines it has read just as they are, except for #line lines, which compiler-specific-build of <b>mcpp</b> converts and outputs into a format that the compiler-proper can accept.  <b>mcpp</b> disregards predefined standard macro because its #define line is enclosed with comment marks.</p>
<p>*2  Therefore, information on where a macro definition is found is not lost during pre-preprocessing.</p>

<h3><a name="3.1.1" href="#toc.3.1.1">3.1.1. Pre-preprocessing of Header File</a></h3>
<p>With above directives, you can "pre-preprocess" header files.  Pre-preprocessing considerably saves the entire preprocessing time.  I think the explanation so far has already given you an understanding of how to pre-preprocess header files, but to deepen your understanding, let me explain it by taking an example of <b>mcpp</b>'s own source code.</p>
<p><b>mcpp</b> source consists of eight *.c files, of which seven files include "system.H" and "internal.H".  No other headers are included.  The source looks like this:</p>
<pre>
#if PREPROCESSED
#include  "mcpp.H"
#else
#include  "system.H"
#include  "internal.H"
#endif
</pre>
<p>The system.H includes noconfig.H or configed.H, as well as several standard header files.  mcpp.H is not a source file I provide and is a "pre-preprocessed" header file I am going to generate.</p>
<p>To generate mcpp.H (of course, after setting up noconfig.H and other headers), invoke <b>mcpp</b> as follows:</p>
<pre>
mcpp &gt; mcpp.H
</pre>
<p>For compiler systems, such as GCC, also specify the -b option.</p>
<p>Enter the following directives from the keyboard:</p>
<pre>
#pragma MCPP preprocess
#include "system.H"
#include "internal.H"
#pragma MCPP put_defines
</pre>
<p>Enter "end-of-file" to terminate <b>mcpp</b>.</p>
<p>This produces mcpp.H, which consists of the preprocessed system.H and internal.H followed by a set of #define lines.  Including mcpp.H has the same effect as including system.H and internal.H, but its size is only a fraction of that of the original header files together with the standard ones, because #if sections and comments are eliminated.  It takes far less time to include mcpp.H in seven *.c files than to include system.H and internal.H seven times.  By using '#pragma MCPP preprocess', much more time can be saved.</p>
<p>On compilation, use the -DPREPROCESSED=1 option.</p>
<p>It is recommended that the above procedure should be written in a file and the makefile should refer to it. The makefile and preproc.c appended to <b>mcpp</b> sources contain the procedure.  Please refer to it.</p>
<p>Although the usage of independent preprocessor is limited for one-pass compilers like Visual C, Borland C or LCC-Win32, the pre-preprocessing facility is useful even for those.</p>
<p>The pre-preprocessing facility of header files is similar to that of the -dD option of GCC, but it differs from it in that:</p>
<ol>
<li>GCC outputs line number information not in the form of '#line 123 "filename"', but in the form of '# 123 "filename"', which allows GCC to reprocess the information, but the Standard C preprocessor cannot.<br>
<li>GCC/cpp of older version outputs a #define line whenever it encounters it, but does not output a #undef line.  Therefore, reprocessing the preprocessed result may produce a different result from what the original source intends.<br>
<li>By using '#pragma MCPP preprocess', which is not provided by GCC, <b>mcpp</b> can provide a higher processing speed.<br>
</ol>
<p>As far as the pre-preprocessing facility is concerned, <b>mcpp</b> is more accurate and practical than GCC.</p>
<br>

<h2><a name="3.2" href="#toc.3.2">3.2. #pragma once</a></h2>
<p>#pragma once directive is available in Standard mode.</p>
<p>#pragma once is also available for GCC, Visual C, LCC-Win32 and compiler-independent preprocessor called Wave.</p>
<p>This directive is used when you want to include a header file only once.  With the following directive in a header file, <b>mcpp</b> includes the header file only once even if a #include line for that file appears many times.</p>
<pre>
#pragma once
</pre>
<p>Usually, compiler-system-specific standard header files prevent duplicate definitions by using the following code:</p>
<pre>
#ifndef __STDIO_H
#define __STDIO_H
/* Contents of stdio.h */
#endif
</pre>
<p>#pragma once provides similar functionality.  With macros, however, the header file is always read. (The preprocessor cannot skim code as people do; it must read the entire header file looking for #if's and #endif's.  Before it can determine whether a line is a directive line, that is, a line with # at the beginning followed by a preprocessing directive, it must recognize comments, and to do so it must also recognize string literals.  After all, it must read through the entire header file and perform most of the tokenization.)  #pragma once eliminates the need even to access the header file, resulting in an improved processing speed for multiple includes.</p>
<p>To determine whether two header files are identical, the characters of the file names, including the directory names in the search path, are compared.  "/DIR1/header.h" and "/DIR2/header.h" are regarded as distinct.  Since Windows is not case sensitive, "header.h" and "HEADER.H" are regarded as the same on Windows, but as distinct on UNIX-like systems.  A directory is recorded after conversion to an absolute path, and a symbolic link on UNIX systems is recorded after dereferencing.  Moreover, the path-list is normalized by removing redundant parts such as <samp>"foo/../"</samp>.  So identical files are always determined correctly. *1, *2, *3</p>
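<p>The removal of redundant parts such as <samp>"foo/../"</samp> can be sketched as a purely lexical pass over the path.  The following is a minimal sketch and not <b>mcpp</b>'s implementation: the function name normalize_path is hypothetical, only absolute paths with '/' separators are handled, symbolic links are assumed to have been dereferenced already, and the case folding needed on Windows is left out.</p>
<pre>
#include &lt;string.h&gt;

/* Hypothetical helper, not mcpp's code.  Normalizes an absolute,
 * '/'-separated path in place: empty components and "." are dropped,
 * and each "name/.." pair is removed.  Other paths are left untouched. */
static void normalize_path(char *path)
{
    char *src = path;   /* next character to read        */
    char *dst = path;   /* end of the output built so far */

    if (*src != '/')
        return;

    while (*src != '\0') {
        const char *comp;
        size_t len;

        while (*src == '/')                 /* skip separators */
            src++;
        if (*src == '\0')
            break;
        comp = src;
        while (*src != '\0' &amp;&amp; *src != '/')
            src++;
        len = (size_t)(src - comp);

        if (len == 1 &amp;&amp; comp[0] == '.')
            continue;                       /* "." adds nothing */
        if (len == 2 &amp;&amp; comp[0] == '.' &amp;&amp; comp[1] == '.') {
            /* Drop the previous component, if any. */
            while (dst &gt; path &amp;&amp; *--dst != '/') {
                /* step back to the preceding '/' */
            }
            continue;
        }
        *dst++ = '/';
        memmove(dst, comp, len);            /* regions may overlap */
        dst += len;
    }
    if (dst == path)
        *dst++ = '/';                       /* the root directory itself */
    *dst = '\0';
}
</pre>
<p>After this pass, "/usr/include/foo/../stdio.h" becomes "/usr/include/stdio.h", so the two spellings compare equal as strings.</p>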
<p>I borrowed the idea of #pragma once from GCC V.1.*/cpp.  GCC V.2.* and V.3.* still have this functionality, but it is regarded as obsolete.  The specification of GCC V.2.*/cpp has been changed as follows: If the entire header file is enclosed with #ifndef _MACRO, #define _MACRO, and #endif, the cpp records it and inclusion occurs only once, even without #pragma once.</p>
<p>However, this GCC V.2 and V.3 specification sometimes does not work for commercially available compiler systems that are not based on the GCC specification, due to a difference in the standard header file notation.  In addition, the GCC V.2 and V.3 specification is more complex to implement.  For this reason, I decided to implement only #pragma once.</p>
<p>As with other preprocessors, it is not advisable to rely only on #pragma once when the same header files are used.  It is recommended that #pragma once should be combined with macros as follows:</p>
<pre>
#ifndef __STDIO_H
#define __STDIO_H
#pragma once
/* Contents of stdio.h */
#endif
</pre>
<p>Note that #pragma once must not be written in &lt;assert.h&gt;.  For the reason, see <a href="cpp-test.html#5.1.2"> cpp-test.html#5.1.2</a>.  The same applies to &lt;cassert&gt; and &lt;cassert.h&gt; of C++.</p>
<p>Another problem is that recent GCC/GLIBC systems have header files, like &lt;stddef.h&gt;, which are repeatedly #included by other system headers.  Those headers define macros, such as <tt>__need_NULL</tt>, <tt>__need_size_t</tt>, and <tt>__need_ptrdiff_t</tt>, and then #include &lt;stddef.h&gt;.  Each time they do so, names such as <tt>NULL</tt>, size_t, and ptrdiff_t are defined in &lt;stddef.h&gt;.  The same applies to &lt;errno.h&gt; and &lt;signal.h&gt;, and even to &lt;stdio.h&gt;.  Other system headers define macros, such as <tt>__need_FILE, __need___FILE</tt>, and then #include &lt;stdio.h&gt;.  Each time they do so, names such as FILE may be defined in &lt;stdio.h&gt;.  #pragma once cannot be used in such header files. *4</p>
<p>Note:</p>
<p>*1 The normalized result can be seen by <samp>'#pragma MCPP debug path'</samp>.  See <a href="#3.5.1">3.5.1</a>.
<samp>'#pragma MCPP put_defines'</samp> and diagnostics use the same result, too.<br>
However, the path-list is not normalized usually in #line line.
By default, the #line line is output as specified by #include line, prepending the normalized include path, if any.
But, if -K option is specified, it is normalized so as to be easily utilized by some other tools.</p>
<p>*2 On CygWIN, /bin and /usr/bin are in reality the same directory, as are /lib and /usr/lib, and supposing / is C:/dir/cygwin on Windows, /cygdrive/c/dir/cygwin is the same as /.  <b>mcpp</b> treats these directories as the same, converting the path-list to the format of /cygdrive/c/dir/cygwin/dir-list/file.</p>
<p>*3 On MinGW, / and /usr are in reality the same directory.  Supposing / is C:/dir/msys/1.0, /c/dir/msys/1.0 is the same as /, and supposing /mingw is C:/dir/mingw, /c/dir/mingw is the same as /mingw.  <b>mcpp</b> treats each of these as the same directory, converting the path-list to the format of c:/dir/msys/1.0/dir-list/file or c:/dir/mingw/dir-list/file.</p>
<p>*4 This is applied at least to Linux/GCC 2.9x, 3.* and 4.*/glibc 2.1, 2.2 and 2.3.  FreeBSD 4, 5, 6 has much simpler system headers because it does not use glibc.</p>

<h3><a name="3.2.1" href="#toc.3.2.1">3.2.1. Tool to Write #pragma once to Header Files</a></h3>
<p>With a small number of header files, writing #pragma once to them does not require much effort, but it would be tremendous work if there are many header files.  I wrote a simple tool to write it automatically to header files.</p>
<p>tool/ins_once.c is a tool written for old versions of GCC.  As Borland C 5.5 conforms to the same standard header file notation as GCC, this tool can be used with it.  However, it is advisable not to use this tool on systems like Glibc 2, which have many exceptions as shown above.</p>
<p>Even in the compiler systems that can use the tool, some header files do not strictly conform to the GCC notation.  GCC's read-once functionality also does not work properly for these header files.</p>
<p>Compile ins_once.c and perform the following command in a directory, such as /usr/include or /usr/local/include, under UNIX.</p>
<pre>
chmod -R u+w *
</pre>
<p>and then execute ins_once as follows:</p>
<pre>
ins_once -t *.h */*.h */*/*.h
</pre>
<p>Ins_once reports header files that do not begin with #ifndef or #if !defined.  Manually modify these files.  Then, execute ins_once as follows:</p>
<pre>
ins_once *.h */*.h */*/*.h
</pre>
<p>If the first directive in each header file is #ifndef or #if !defined, a #pragma once line is inserted immediately below that line.  Only a root user or a user with an appropriate permission can make this modification.  If you modified the access permission, use 'chmod -R u-w *' to restore the original permission.</p>
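<p>The check on the first directive can be pictured with the following C sketch.  It is not the code of ins_once itself: the function names are hypothetical, and the sketch looks at a single line only, without the comment skipping that the real tool performs.</p>
<pre>
#include &lt;string.h&gt;

/* Hypothetical helper: skips spaces and tabs. */
static const char *skip_blanks(const char *p)
{
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}

/* Hypothetical helper, not taken from ins_once.c.  Returns nonzero when
 * 'line' is an "#ifndef NAME" or "#if !defined ..." directive, that is,
 * a line below which a '#pragma once' line could be inserted. */
static int is_guard_opener(const char *line)
{
    const char *p = skip_blanks(line);

    if (*p != '#')
        return 0;
    p = skip_blanks(p + 1);

    if (strncmp(p, "ifndef", 6) == 0)
        return p[6] == ' ' || p[6] == '\t';

    if (strncmp(p, "if", 2) == 0
            &amp;&amp; (p[2] == ' ' || p[2] == '\t' || p[2] == '!')) {
        p = skip_blanks(p + 2);
        if (*p != '!')
            return 0;
        p = skip_blanks(p + 1);
        return strncmp(p, "defined", 7) == 0;
    }
    return 0;
}
</pre>
<p>Lines such as "#ifndef __STDIO_H" and "#if !defined(__STDIO_H)" are accepted by this sketch, while "#ifdef __STDIO_H" and "#include &lt;stddef.h&gt;" are not.</p>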
<p>Ins_once provides the following options.  Select the most appropriate one for your system.</p>
<ul>
<li>-t:  Check whether a header file begins with #ifndef or #if !defined, excluding a comment.  This option does not modify the file.<br>
<li>-p:  Insert a #pragma once line at the beginning of the file.  By default, this line is inserted immediately below the #ifndef or #if !defined line.<br>
<li>-g:  For GCC system, &lt;stddef.h&gt;, &lt;stdio.h&gt;, &lt;signal.h&gt;, &lt;errno.h&gt; are also excluded.  By default, only &lt;assert.h&gt;, &lt;cassert&gt; and &lt;cassert.h&gt; are excluded.<br>
</ul>
<p>ins_once roughly checks to write a #pragma once line only once in the same header file even if it is executed several times, but the check is not very strict.  As this ins_once is of temporary and tentative nature, it scarcely performs tokenization.  It worked as I expected with FreeBSD 2.0 and 2.2.7, Borland C 5.5, but it may not work properly for special header files.  So before executing this tool, be sure to make a backup of an original file.</p>
<p>Have the shell expand a wild-card. (In case of buffer overflow, execute ins_once several times by specifying some of your system header files.)</p>
<br>

<h2><a name="3.3" href="#toc.3.3">3.3. #pragma MCPP warning, #include_next, #warning</a></h2>
<p>These directives are provided for compatibility with GCC.  GCC provides the #include_next and #warning directives.  Although they are non-conforming, some source programs use them, and so do some Glibc2 system header files.  Taking this situation into consideration, I implemented the #include_next and #warning directives in the GCC-specific-build to allow compilation of such source programs; <b>mcpp</b>, however, issues a warning when it finds these directives.  Regardless of the compiler system <b>mcpp</b> is ported to, <b>mcpp</b> in Standard mode also implements #pragma MCPP warning.</p>
<p>With the following directive, <b>mcpp</b> skips the current file's directory and starts searching for header.h from the next directory in the search path.</p>
<pre>
#include_next  &lt;header.h&gt;
</pre>
<p>CygWIN and MinGW ignore the case of letters in header names.</p>
<p>The following code outputs 'any message' to stderr as a warning message:</p>
<pre>
#pragma MCPP warning    any message

#warning  any message
</pre>
<p>Unlike #error, this is not counted as an error.</p>
<br>

<h2><a name="3.4" href="#toc.3.4">3.4. #pragma MCPP push_macro, #pragma __setlocale and others</a></h2>
<p>When I ported <b>mcpp</b> to Visual C, I implemented these directives in <b>mcpp</b>, and then made them available for other systems.</p>
<p>'#pragma MCPP push_macro( "MACRO")' and '#pragma MCPP pop_macro( "MACRO")' "push" the definition of the macro MACRO onto the macro definition stack and "pop" it from that stack, respectively.</p>
<p>'#pragma push_macro( "MACRO")' and '#pragma pop_macro( "MACRO")' are also available for Visual C.</p>
<p>push_macro saves a macro definition on the stack, and pop_macro restores the saved definition.  The macro definition remains in effect after push_macro.  To invalidate it, use #undef or redefine the macro with a new definition.  push_macro can be used multiple times for the same macro name.</p>
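<p>The save, override and restore sequence can be sketched as follows.  This is only an illustration: the macro LIMIT and the two functions are hypothetical names, and the unprefixed spelling is used on the assumption that the file is compiled with a compiler that accepts it (Visual C, and also GCC and Clang).  With <b>mcpp</b> on other systems the same sequence would be written with '#pragma MCPP push_macro' and '#pragma MCPP pop_macro'.</p>

```c
#define LIMIT 10

#pragma push_macro("LIMIT")     /* save the definition LIMIT == 10 */
#undef LIMIT
#define LIMIT 99                /* temporary override (hypothetical value) */

int limit_inside(void)
{
    return LIMIT;               /* the override is in effect here */
}

#pragma pop_macro("LIMIT")      /* restore the saved definition */

int limit_outside(void)
{
    return LIMIT;               /* the original definition is back */
}
```

<p>Under these assumptions limit_inside() sees the overriding definition and limit_outside() sees the restored one.</p>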
<p>'#pragma __setlocale( "&lt;encoding&gt;")' changes the current multi-byte character encoding to &lt;encoding&gt;.  The argument of setlocale must be a string literal.  For &lt;encoding&gt;, refer to <a href="#2.8"> 2.8</a>.  This directive allows you to use several encodings in one translation unit.</p>
<p>In Visual C++, '#pragma __setlocale' cannot be used.  Use '#pragma setlocale' instead.  Encoding specification must be conveyed not only to <b>mcpp</b> but also to the compiler-proper.  The latter can recognize only #pragma setlocale.  For other compiler systems, when the compiler-proper cannot recognize an encoding, <b>mcpp</b> complements it.</p>
<p>There is not yet any compiler-proper which can recognize '#pragma __setlocale'.</p>
<br>

<h2><a name="3.5" href="#toc.3.5">3.5. #pragma MCPP debug, #pragma MCPP end_debug, #debug, #end_debug</a></h2>
<p>'#pragma MCPP debug' and '#pragma MCPP end_debug' are for Standard mode.  #debug and #end_debug are for pre-Standard mode.</p>
<p>The '#pragma MCPP debug &lt;args&gt;' directive can be written anywhere in a source program.  &lt;args&gt; specifies the types of debug information.  One #pragma MCPP debug directive can take several &lt;arg&gt;s, and at least one &lt;arg&gt; must be specified.  <b>mcpp</b> begins to output debug information when it finds this directive, and stops when it encounters '#pragma MCPP end_debug &lt;args&gt;'.  On end_debug the &lt;args&gt; can be omitted, in which case all types of debug information are reset.  If &lt;args&gt; contains an argument that <b>mcpp</b> does not support, <b>mcpp</b> issues a warning, but all the preceding arguments are regarded as valid.</p>
<p>All the debug information is output to the same destination as the preprocessing output, so that the two stay synchronized.  Therefore, this directive usually prevents compilation.
Nevertheless, <samp>#pragma MCPP debug macro_call</samp> embeds its information in comments, so its output can be re-preprocessed and compiled.</p>
<p>When you notice that something is wrong with the preprocessing result, enclose the code you want to debug in the following directives, for example:</p>
<pre>
#pragma MCPP debug token expand
/* Coding you want to debug  */
#pragma MCPP end_debug
</pre>
<p>As this directive was originally used for debugging <b>mcpp</b> itself, it was not developed with end users in mind.  So, you may not understand its behavior unless you read its source code, and you may sometimes feel it outputs too much information, but it is useful for tracing the preprocessing process.  Be patient.</p>
<p>The following debug information types can be specified with &lt;arg&gt;.</p>
<blockquote>
<table>
  <tr><th>path      </th><td>Displays the include file search path.</td></tr>
  <tr><th>token     </th><td>Parses tokens one by one and displays the type of each.</td></tr>
  <tr><th>expand    </th><td>Traces a macro expansion process.</td></tr>
  <tr><th>macro_call</th><td>Embeds macro notifications into comments on each macro definition and macro expansion.</td></tr>
  <tr><th>if        </th><td>Displays the result (true or false) of #if, #elif, #ifdef and #ifndef.</td></tr>
  <tr><th>expression</th><td>Traces #if expression evaluation.</td></tr>
  <tr><th>getc      </th><td>Traces preprocessing byte by byte.</td></tr>
  <tr><th>memory    </th><td>Displays the status of heap memory used by <b>mcpp</b>.</td></tr>
</table>
</blockquote>

<h3><a name="3.5.1" href="#toc.3.5.1">3.5.1. #pragma MCPP debug path, #debug path</a></h3>
<p>With these directives, <b>mcpp</b> displays include directories in the search path (excluding the current and source directories with which search begins) in the order of priority, starting with the highest one first.</p>
<p>In addition, with a #include directive, <b>mcpp</b> displays all the directories, including the current one, it actually searched for the #include file.<br>
When a header file with #pragma once specified is #included again, the message to that effect is displayed.<br>
Moreover, <b>mcpp</b> normalizes the path-list removing the redundant part such as <samp>"foo/../"</samp>, and displays the result when the normalized path-list differs from the original one.<br>
Also, <b>mcpp</b> dereferences a symbolic link to the file it links to, and displays the result when such a conversion has occurred.</p>

<h3><a name="3.5.2" href="#toc.3.5.2">3.5.2. #pragma MCPP debug token, #debug token</a></h3>
<p>With these directives, <b>mcpp</b> displays each source line it reads, and then displays each token on that line together with its type as the token is read.  More specifically, the token is a preprocessing-token (pp-token).  Not only the pp-tokens on a source line but also those <b>mcpp</b> reads again internally during macro expansion are displayed, repeatedly.</p>
<p>However, the following 1-byte tokens are not displayed, for the convenience of the <b>mcpp</b> implementation:</p>
<ol>
<li>'#' at the beginning of a preprocessing directive line.<br>
<li>'(' at the beginning of a parameter list of a function-like macro definition.<br>
<li>',' delimiting the parameters of a function-like macro definition.<br>
<li>'(' at the beginning of an argument list used for a function-like macro invocation.<br>
</ol>
<p>A pp-token has the following types:</p>
<blockquote>
<table>
  <tr><th>NAM </th><td>Identifier</td></tr>
  <tr><th>NUM </th><td>Preprocessing-number</td></tr>
  <tr><th>OPE </th><td>Operator or punctuator</td></tr>
  <tr><th>STR </th><td>String literal</td></tr>
  <tr><th>WSTR</th><td>Wide string literal</td></tr>
  <tr><th>CHR </th><td>Character constant</td></tr>
  <tr><th>WCHR</th><td>Wide character constant</td></tr>
  <tr><th>SPE </th><td>Special pp-tokens, such as $ and @</td></tr>
  <tr><th>SEP </th><td>Token separator white space</td></tr>
</table>
</blockquote>
<p>SEP tokens other than &lt;newline&gt; are not normally displayed.  Control codes such as &lt;newline&gt; are displayed as &lt;^J&gt; or &lt;^M&gt;.</p>

<h3><a name="3.5.3" href="#toc.3.5.3">3.5.3. #pragma MCPP debug expand, #debug expand</a></h3>
<p>With these directives, <b>mcpp</b> traces the expansion process of a macro invocation.  When <b>mcpp</b> in Standard mode encounters '#pragma MCPP debug expand', it behaves as follows:</p>
<p>If there is a macro invocation, <b>mcpp</b> displays the macro definition.  Each argument is read and substituted for the corresponding parameter in the replacement list, and then the replacement list is rescanned.  <b>mcpp</b> displays this whole process.  Nested macros are rescanned and expanded one by one.  If an argument contains a macro, <b>mcpp</b> traces the above process recursively before parameter substitution.</p>
<p>Each time control is passed to and returned from a certain set of <b>mcpp</b> internal functions, <b>mcpp</b> displays the trace information along with the function name.  The following table shows the role of these functions.  Reading the <b>mcpp</b> source code will give you a concrete idea of what each function is doing.</p>
<blockquote>
<table>
  <tr><th>expand_macro</th><td>Entrance routine for macro expansion</td></tr>
  <tr><th>replace     </th><td>Expands a macro one level down.</td></tr>
  <tr><th>collect_args</th><td>Collects arguments.</td></tr>
  <tr><th>prescan     </th><td>Scans a replacement list and processes # and ## operator.</td></tr>
  <tr><th>substitute  </th><td>Substitutes parameters with arguments.</td></tr>
  <tr><th>rescan      </th><td>Rescans a replacement list.</td></tr>
</table>
</blockquote>
<p>Except for expand_macro, the above functions call one another in indirect recursion.</p>
<p>For replace and collect_args, <b>mcpp</b> displays data it internally stacks during macro expansion.  This data is displayed using the following internal codes:</p>
<blockquote>
<table>
  <tr><th>&lt;n&gt;     </th><td>Nth parameter</td></tr>
  <tr><th>&lt;TSEP&gt;  </th><td>Token delimiter inserted by <b>mcpp</b></td></tr>
  <tr><th>&lt;MAGIC&gt; </th><td>Code that inhibits re-replacement of the macro of the same name</td></tr>
  <tr><th>&lt;RT_END&gt;</th><td>Code that indicates the end of a replacement list</td></tr>
  <tr><th>&lt;SRC&gt;   </th><td>Code that indicates an identifier taken from source file while rescanning</td></tr>
</table>
</blockquote>
<p>&lt;SRC&gt; is used only in <i>STD</i> mode, and is not used in <i>POSTSTD</i> mode nor in <i>COMPAT</i> mode.</p>
<p>It is recommended to use '<samp>#pragma MCPP debug token</samp>' as well.</p>
<p>If you also specify '<samp>#pragma MCPP debug macro_call</samp>' or the -K option, macro notifications are output embedded in comments.
However, in replace() and its subordinate routines some magic characters (internal codes) are written to or removed from the input stream instead of comments.
These magic characters are displayed as:</p>
<blockquote>
<table>
  <tr><th>&lt;MACm&gt;     </th><td>Call of the m'th macro contained in one macro call</td></tr>
  <tr><th>&lt;MAC_END&gt;  </th><td>The end of the macro call started by previous MACm</td></tr>
  <tr><th>&lt;MACm:ARGn&gt;</th><td>The n'th argument of the m'th macro call</td></tr>
  <tr><th>&lt;ARG_END&gt;  </th><td>The end of the argument started by previous MACm:ARGn</td></tr>
</table>
</blockquote>
<p>If you specify the -v option too, the MAC_END and ARG_END markers also display the same numbers as the corresponding starting markers.</p>
<p>For #debug expand, <b>mcpp</b> uses internal routines considerably different from those used for Standard mode.  The explanations are omitted.</p>

<h3><a name="3.5.4" href="#toc.3.5.4">3.5.4. #pragma MCPP debug if, #debug if</a></h3>
<p>With these directives, <b>mcpp</b> displays #if, #elif, #ifdef and #ifndef lines and reports their evaluation result (true or false).  As for a skipped #if section, no report is made.</p>

<h3><a name="3.5.5" href="#toc.3.5.5">3.5.5. #pragma MCPP debug expression, #debug expression</a></h3>
<p>With these directives, <b>mcpp</b> traces evaluation of a #if or #elif expression.  DECUS cpp, on which <b>mcpp</b> is based, provided these directives for the purpose of debugging cpp itself, and I have scarcely modified them.  The output is a very long list of internal function names, as well as variable names and their values.  Unless you read the <b>mcpp</b> source code, you may not understand these variables.  Even without the source code, however, you can manage to see how <b>mcpp</b> pushes the values of a complex expression onto an evaluation stack and pops them off it.</p>

<h3><a name="3.5.6" href="#toc.3.5.6">3.5.6. #pragma MCPP debug getc, #debug getc</a></h3>
<p>With these directives, <b>mcpp</b> outputs detailed data each time it calls get_ch(), a function to read one byte.  When <b>mcpp</b> in Standard mode scans a pp-token, this routine is called to read only the first byte of the pp-token.</p>
<p>With a #debug getc, <b>mcpp</b> calls this routine during token scan, resulting in a tremendous amount of data output.</p>
<p>Either way, these directives output a huge amount of data, so you will scarcely need to use them.</p>

<h3><a name="3.5.7" href="#toc.3.5.7">3.5.7. #pragma MCPP debug memory, #debug memory</a></h3>
<p>With these directives, <b>mcpp</b> reports, once, the status of the heap memory it has internally allocated or released using malloc(), realloc() or free().  Only the kmmalloc I developed and some other implementations of malloc() provide this functionality.  Refer to <a href="mcpp-porting.html#4.extra"> mcpp-porting.html#4.extra</a>.  With any other malloc(), <b>mcpp</b> neither causes an error nor reports a status.</p>
<p><b>mcpp</b> reports the heap memory status again when it terminates with these directives on.  The same thing happens when <b>mcpp</b> terminates because it has run out of memory.</p>

<h3><a name="3.5.8" href="#toc.3.5.8">3.5.8. #pragma MCPP debug macro_call</a></h3>
<p>With this directive, <b>mcpp</b> starts <b>macro notification mode</b>.
In this mode, for each macro definition and macro expansion, the line and column information in the source file is output embedded in comments.
For a macro call with arguments, the location of each argument is reported too.
Token concatenation by a macro, however, may cause loss of macro information about the tokens before concatenation.<br>
In addition, some information is output on #undef, #if (#elif, #ifdef, #ifndef) and #endif, too.<br>
This mode is also enabled by the -K option.</p>
<p>Macro notification mode is designed to allow reconstruction of the original source position from the preprocessed output. The primary purpose of this mode is to allow C/C++ refactoring tools to refactor source code without having to implement a special-purpose C preprocessor. This mode is also handy for debugging macro expansions. *1</p>
<p>The goal of macro notification mode is to annotate every macro expansion, while still allowing the output to be compiled.
'<samp>#pragma MCPP debug expand</samp>', on the other hand, traces macro expansion and outputs detailed information, but its output is not compilable.</p>
<p>Note:</p>
<p>*1 Most of the specifications of macro notification mode were proposed by Taras Glek.
He is working on refactoring of sources at the Mozilla project:</p>
<p><a href="http://blog.mozilla.com/tglek/"> http://blog.mozilla.com/tglek/</a></p>

<h4><a name="3.5.8.1">3.5.8.1. Comment on #define</a></h4>
<p>For example, macro definition directives such as:</p>
<pre>
#define NULL 0L
#define BAR(x, y) x ## y
#define BAZ(a, b) a + b
</pre>
<p>produce the following output:</p>
<pre>
/*mNULL 1:9-1:16*/
/*mBAR 2:9-2:25*/
/*mBAZ 3:9-3:24*/
</pre>
<p>where the format means:</p>
<samp>/*m[NAME] [start line]:[start column]-[end line]:[end column]*/</samp>
<p>Line and column numbers start from 1.
When you specify the -K option, predefined macros are output too; these have no location information.</p>
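<p>A tool that consumes this output could read a definition comment back with sscanf(), roughly as sketched below.  The structure and function names are hypothetical and are not part of <b>mcpp</b>; the sketch also assumes that macro names consist of letters, digits and underscores only, and it does not try to guess the exact form of the comment for a predefined macro.</p>

```c
#include <stdio.h>

/* Source range carried by a macro definition comment (all 1-based). */
struct macro_def_note {
    char name[64];
    int  start_line, start_col;
    int  end_line, end_col;
};

/*
 * Parse one definition comment into *note.
 * Returns 1 when the name and the location were both read,
 * 0 when only a name could be read,
 * and -1 when the text is not a definition comment at all.
 */
int parse_macro_def_note(const char *text, struct macro_def_note *note)
{
    int n = sscanf(text, "/*m%63[A-Za-z0-9_] %d:%d-%d:%d",
                   note->name,
                   &note->start_line, &note->start_col,
                   &note->end_line, &note->end_col);

    if (n == 5)
        return 1;
    if (n >= 1)
        return 0;
    return -1;
}
```

<p>Applied to the first comment above, this sketch would fill in the name NULL and the range from line 1, column 9 to line 1, column 16.</p>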

<h4><a name="3.5.8.2">3.5.8.2. Comment on #undef</a></h4>
<pre>
#undef  BAZ
</pre>
<p>This line produces the output:</p>
<pre>
/*undef 10*//*BAZ*/
</pre>
<p>The [lnum] and [NAME] in the format /*undef [lnum]*//*[NAME]*/ indicate the line number of the directive and the name of the undefined macro.</p>

<h4><a name="3.5.8.3">3.5.8.3. Comment on Macro Expansion</a></h4>
<p>Within source code other than directives, macros are expanded with markers indicating the start and end of each macro expansion.
The format allows for HTML-like nesting. <samp>/*&lt;...*/</samp> signals the start of a macro expansion and <samp>/*&gt;*/</samp> the end.
The start marker takes the same format as the macro definition comment, with the leading <samp>/*m</samp> replaced by <samp>/*&lt;</samp>:</p>
<samp>/*&lt;[NAME] [start line]:[start column]-[end line]:[end column]*/</samp>
<p>For a macro with arguments, markers indicating the source location of each argument and markers indicating the start and end of each argument expansion are output too.
The marker for an argument location takes the format <samp>/*!...*/</samp>.
When a macro is found in an argument, information on that macro is output recursively, with its location information if it is in the source file.
Macro argument markers also have a disambiguating naming scheme. An argument name is of the format:</p>
<samp>[func-like-macro-name]:[nesting level]-[argument number]</samp>
<p>This way, if someone calls '<samp>BAZ(BAZ(a,b), c)</samp>', it is possible to distinguish nested macros of the same name and their arguments from each other.
The argument number starts from 0.
In the location marker, the location follows the argument name in the format:</p>
<samp>[start line]:[start column]-[end line]:[end column]</samp>
<p>The marker for the start of an argument takes the format:</p>
<samp>/*&lt;[func-like-macro-name]:[nesting level]-[argument number]*/</samp>
<p>The marker for the end of an argument is the same as the one for the end of a macro expansion: <samp>/*&gt;*/</samp>.</p>
<p>The following lines:</p>
<pre>
foo(NULL);
foo(BAR(some_, var));
foo = BAZ(NULL, 2);
bar = BAZ(BAZ(a,b), c);
</pre>
<p>expand to:</p>
<pre>
foo(/*&lt;NULL 4:5-4:9*/0L/*&gt;*/);
foo(/*&lt;BAR 5:5-5:20*//*!BAR:0-0 5:9-5:14*//*!BAR:0-1 5:16-5:19*/some_var/*&gt;*/);
foo = /*&lt;BAZ 6:7-6:19*//*!BAZ:0-0 6:11-6:15*//*!BAZ:0-1 6:17-6:18*//*&lt;BAZ:0-0*//*&lt;NULL 6:11-6:15*/0L/*&gt;*//*&gt;*/ + /*&lt;BAZ:0-1*/2/*&gt;*//*&gt;*/;
bar = /*&lt;BAZ 7:7-7:23*//*!BAZ:0-0 7:11-7:19*//*!BAZ:0-1 7:21-7:22*//*&lt;BAZ:0-0*//*&lt;BAZ 7:11-7:19*//*!BAZ:1-0*//*!BAZ:1-1*//*&lt;BAZ:1-0*/a/*&gt;*/ + /*&lt;BAZ:1-1*/b/*&gt;*//*&gt;*//*&gt;*/ + /*&lt;BAZ:0-1*/c/*&gt;*//*&gt;*/;
</pre>
<p>Moreover, when the -v option is specified along with the -K option, the markers for the end of a macro expansion and the end of an argument expansion also carry the same macro name and number as their starting markers:</p>
<pre>
foo(/*&lt;NULL 4:5-4:9*/0L/*NULL&gt;*/);
foo(/*&lt;BAR 5:5-5:20*//*!BAR:0-0 5:9-5:14*//*!BAR:0-1 5:16-5:19*/some_var/*BAR&gt;*/);
foo = /*&lt;BAZ 6:7-6:19*//*!BAZ:0-0 6:11-6:15*//*!BAZ:0-1 6:17-6:18*//*&lt;BAZ:0-0*//*&lt;NULL 6:11-6:15*/0L/*NULL&gt;*//*BAZ:0-0&gt;*/ + /*&lt;BAZ:0-1*/2/*BAZ:0-1&gt;*//*BAZ&gt;*/;
bar = /*&lt;BAZ 7:7-7:23*//*!BAZ:0-0 7:11-7:19*//*!BAZ:0-1 7:21-7:22*//*&lt;BAZ:0-0*//*&lt;BAZ 7:11-7:19*//*!BAZ:1-0*//*!BAZ:1-1*//*&lt;BAZ:1-0*/a/*BAZ:1-0&gt;*/ + /*&lt;BAZ:1-1*/b/*BAZ:1-1&gt;*//*BAZ&gt;*//*BAZ:0-0&gt;*/ + /*&lt;BAZ:0-1*/c/*BAZ:0-1&gt;*//*BAZ&gt;*/;
</pre>
<p>As you see in this example, all the ending markers correspond to the last preceding starting markers of the same nesting level.
Hence, you can judge their correspondence automatically even without -v option.</p>
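<p>Because the markers nest strictly, matching them needs no more than a depth counter.  The following is a minimal sketch of such a check for one line of output produced without the -v option; the function name is hypothetical, and the sketch treats the text purely as a character string, without tokenizing it.</p>

```c
#include <string.h>

/*
 * Walk one line of annotated output and track how deeply the
 * expansion markers nest.  Returns the maximum depth reached,
 * or -1 when an end marker has no open start marker or when
 * some start marker is never closed.
 * Only the marker forms produced without -v are recognized.
 */
int max_expansion_depth(const char *line)
{
    static const char start_mark[] = "/*<";
    static const char end_mark[]   = "/*>*/";
    int depth = 0, max_depth = 0;

    while (*line != '\0') {
        if (strncmp(line, start_mark, 3) == 0) {
            if (++depth > max_depth)
                max_depth = depth;
            line += 3;
        } else if (strncmp(line, end_mark, 5) == 0) {
            if (--depth < 0)
                return -1;
            line += 5;
        } else {
            line++;
        }
    }
    return depth == 0 ? max_depth : -1;
}
```

<p>A tool that needs the actual pairs rather than the depth could push the position of each start marker on a stack and pop it at each end marker in the same loop.</p>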

<h4><a name="3.5.8.4">3.5.8.4. Comment on #if (#elif, #ifdef, #ifndef)</a></h4>
<p>On a #if (#elif, #ifdef, #ifndef) line, information on the macros in the line is shown.</p>
<p>For example, this is bar.h:</p>
<pre>
#define NULL 0L
#define BAR(x, y) x ## y
#define BAZ(a, b) a + b
</pre>
<p>And here is foo.c:</p>
<pre>
#include "bar.h"
#ifdef  BAR
#ifndef BAZ
#if 1 + BAR( 2, 3)
#endif
#else
#if 1
#endif
#if BAZ( 1, BAR( 2, 3))
#undef  BAZ
#endif
#endif
#endif
</pre>
<p>Then, foo.c produces the following output:</p>
<pre>
#line 1 "/dir/foo.c"
#line 1 "/dir/bar.h"
/*mNULL 1:9-1:16*/
/*mBAR 2:9-2:25*/
/*mBAZ 3:9-3:24*/
#line 2 "/dir/foo.c"
/*ifdef 2*//*BAR*//*i T*/
/*ifndef 3*//*BAZ*//*i F*/


/*else 6:T*/
/*if 7*//*i T*/
/*endif 8*/
/*if 9*//*BAZ*//*BAR*//*i T*/
/*undef 10*//*BAZ*/
#line 11 "/dir/foo.c"
/*endif 11*/
/*endif 12*/
/*endif 13*/
</pre>
<p>As you see, on a #if line, the annotation starts with a marker in the /*if [lnum]*/ format, where [lnum] indicates the current line number.
If any macros are found on the line, one /*[NAME]*/ marker follows for each of them.
The annotation terminates with /*i T*/ or /*i F*/ which indicates that the directive is evaluated true or false, respectively.
The expansion result is not displayed, unlike a macro on lines other than directives.
On a line such as '<samp>#if 1</samp>' which has no macro, no /*[NAME]*/ is displayed.</p>
<p>Also annotations on <samp>#elif, #ifdef</samp> and <samp>#ifndef</samp> start with /*elif [lnum]*/, /*ifdef [lnum]*/ and /*ifndef [lnum]*/, respectively, followed by /*[NAME]*/, if some macros are found, and terminate with /*i T*/ or /*i F*/.</p>
<p>In any blocks where compilation is to be skipped, no annotation is displayed.</p>
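<p>As an illustration of how a tool might consume these annotations, the sketch below extracts the directive keyword, the line number and the result from one annotation.  The function name and its return convention are hypothetical.  The sketch deliberately handles only #if, #elif, #ifdef and #ifndef annotations; annotations without a /*i T*/ or /*i F*/ marker are reported as not recognized.</p>

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the directive keyword and line number from the front of a
 * conditional annotation, then look for the trailing result marker.
 * Returns 1 for true, 0 for false, and -1 when the text does not
 * look like a conditional annotation with a result.
 * kind must have room for at least 16 bytes.
 */
int parse_if_note(const char *text, char *kind, int *lnum)
{
    if (sscanf(text, "/*%15[a-z] %d", kind, lnum) != 2)
        return -1;
    if (strstr(text, "/*i T*/") != NULL)
        return 1;
    if (strstr(text, "/*i F*/") != NULL)
        return 0;
    return -1;
}
```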

<h4><a name="3.5.8.5">3.5.8.5. Comment on #else and #endif</a></h4>
<p>On a #else line, as in the above example, information is displayed in the /*else [lnum]:[C]*/ format, where [lnum] is the current line number and [C] is 'T' or 'F', indicating whether the #else - #endif block is to be compiled or skipped.</p>
<p>On a #endif line, as in the above example, information is displayed in the /*endif [lnum]*/ format, where [lnum] indicates the current line number.
Of course, the #endif corresponds to the last #if (#ifdef, #ifndef) which is not yet closed.</p>

<h4><a name="3.5.8.6">3.5.8.6. #line Output</a></h4>
<p>In addition, in macro notification mode, the output format of the filename on a #line line differs from the default.
The filename is output as a "normalized" full path-list.
See <a href="#3.2">3.2</a>.
This is for the convenience of writing refactoring tools.</p>
<br>

<h2><a name="3.6" href="#toc.3.6">3.6. #assert, #asm, #endasm</a></h2>
<p>#assert is available in pre-Standard mode, except in the GCC-specific-build.  #assert provides functionality equivalent to the #error directive in Standard C.  The following code in Standard C:</p>
<pre>
#if ULONG_MAX/2 &lt; LONG_MAX
#error Bad unsigned long handling.
#endif
</pre>
<p>can be expressed as:</p>
<pre>
#assert LONG_MAX &lt;= ULONG_MAX/2
</pre>
<p>The argument of #assert is evaluated as a #if expression.  If it evaluates to true (non-zero), <b>mcpp</b> does nothing and if false (0), it displays the following message and then the argument line (after processing line splicing and comments):</p>
<pre>
Preprocessing assertion failed
</pre>
<p><b>mcpp</b> counts this as an error but continues processing.</p>
<p>This #assert is quite different from that of System V or GCC.</p>
<p><b>mcpp</b> in pre-Standard mode regards a block enclosed with the #asm and #endasm directives as assembler coding.  <b>mcpp</b> implements this functionality for Microware C/6809 only.  To implement this functionality in other compiler systems, do_old() and put_asm() in system.c must be modified.</p>
<p>For a #asm block, <b>mcpp</b> performs trigraph conversion and deletes &lt;backslash&gt;&lt;newline&gt; sequences, but it does not process comments, check tokens or characters, or delete white-space characters at the beginning of a line.  Also, it does not expand a token that happens to have the same name as a macro, and outputs it as it is.  Other directive lines have no meaning within the #asm block.</p>
<p>These #asm and #endasm directives do not conform to Standard C.  In the first place, extension directives in any form other than "#pragma sub-directive" are not Standard C conforming.  Changing the directive names to #pragma asm and #pragma endasm does not solve this problem.  In Standard C, the source code must consist of a C token sequence (more precisely, a preprocessing token sequence), but an assembler program is not a C token sequence.  To use assembly code in Standard C, there is no other way but to embed it in string literal tokens.  Then, you have to implement a built-in function in the compiler-proper that processes the string literal, and call it as follows:</p>
<pre>
asm (
    " leax _iob+13,y\n"
    " pshs x\n"
);
</pre>
<p>However, this is not suitable for longer assembly code, in which case you had better write the assembly code in a separate file, like a library function, and assemble and link it with the program.  This may seem inconvenient, but the assembler portion must be separated completely in order to write a portable C program.  It is recommended that you write assembly code in a separate file rather than using #asm.</p>
<br>

<h2><a name="3.7" href="#toc.3.7">3.7. New C99 Features (_Pragma() operator, Variadic Macro and others)</a></h2>
<p>These features are available in Standard mode.  The -V199901L option, with <tt>__STDC_VERSION__</tt> set to 199901L, enables the following C99 features.  The same applies to C++ for the -V199901L option with <tt>__cplusplus</tt> set to 199901L or more.  Although the C++ Standard does not provide for these features other than 1 and 7, <b>mcpp</b> in Standard mode provides them for better compatibility with C99.  Standard mode also allows variable argument macros even in the C90 and C++ modes. *1</p>
<ol>
<li>Treats the text from // to the end of a line as a comment.<br>
<br>
<li>Enables variable argument macros.<br>
<br>
<li>Allows the sequence of p+, P+, p-, and P-, as well as e+, E+, e-, and E-, in the preprocessing-number.  This is to represent a bit pattern of a floating-point number in Hex, like 0x1.FFFFFEp+128.<br>
<br>
<li>Enables the _Pragma() operator.<br>
<br>
<li><b>mcpp</b> compiled with the <i>EXPAND_PRAGMA</i> macro set to <i>TRUE</i> macro-expands the arguments of #pragma lines that do not begin with STDC, MCPP or GCC.  (By default, <b>mcpp</b> is compiled with <i>EXPAND_PRAGMA</i> == <i>FALSE</i>, so the arguments are not subject to macro expansion.  They are macro-expanded only in the Visual C-specific-build and the Borland C-specific-build.)<br>
<br>
<li>For compiler-systems with long long, a #if expression is evaluated in long long or unsigned long long.<br>
<br>
<li>Allows an escape sequence named UCN for Unicode in the forms of \unnnn and \Unnnnnnnn in identifiers, character constants, string literals and pp-numbers.  The value of a UCN in #if expression is evaluated as a hexadecimal representation. (UCN cannot be used in <i>POSTSTD</i> mode.)<br>
</ol>
<p>A variable argument macro takes the form of:</p>
<pre>
#define debug(...)  fprintf(stderr, __VA_ARGS__)
</pre>
<p>Here is a macro invocation:</p>
<pre>
debug( "X = %d\n", x);
</pre>
<p>This macro is expanded as follows:</p>
<pre>
fprintf(stderr, "X = %d\n", x);
</pre>
<p><samp>...</samp> in the parameter list corresponds to one or more arguments.  In the replacement list, <samp>__VA_ARGS__</samp> stands for the parameter <samp>...</samp>, as in the above example.  In a macro invocation, the arguments that correspond to <samp>...</samp>, including the "," between them, are merged and treated as one argument.</p>
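<p>The same mechanism can be tried in a small self-contained form.  The sketch below is an illustration only: FORMAT_INTO and format_x are hypothetical names, and snprintf() is used instead of fprintf() so that the result can be inspected.  It assumes a C99 compiler.</p>

```c
#include <stdio.h>

/* Everything after buf, commas included, becomes __VA_ARGS__. */
#define FORMAT_INTO(buf, ...)  snprintf((buf), sizeof (buf), __VA_ARGS__)

/* Formats x in the style of the debug() invocation, without the newline. */
const char *format_x(int x)
{
    static char line[32];       /* an array, so sizeof (buf) is its size */

    FORMAT_INTO(line, "X = %d", x);
    return line;
}
```

<p>Here the format string and x together form the single variable argument, so the invocation expands to snprintf((line), sizeof (line), "X = %d", x).</p>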
<p>_Pragma( "foo bar") has the same effect as writing #pragma foo bar.  The argument of the _Pragma() operator must be one string literal or wide string literal.  For a wide string literal, the prefix (L) is deleted and the rest is treated the same as a string literal.  For a string literal, the " characters enclosing it are deleted, and each \" and \\ in it is replaced with " and \, respectively, before it is treated as a #pragma argument.</p>
<p>#pragma must be written in one logical line of its own, and its argument is not macro-expanded, at least in C90.  The _Pragma() operator, on the other hand, can be written anywhere in source code (even in a replacement list), and has the same effect as #pragma written in a logical line.  A _Pragma() operator generated during macro expansion is also valid.  This flexibility gives pragma directives a wide range of portability and allows a header file to absorb the differences in #pragma among compiler systems. (For a sample, see pragmas.h and pragmas.t of "Validation Suite".) *2</p>
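<p>The conversion of the string literal can be written out as a small function.  This is a sketch of the rule as described here, not the code <b>mcpp</b> uses; the function name is hypothetical, and the caller is assumed to supply an output buffer at least as large as the input.</p>

```c
#include <stddef.h>

/*
 * Turn the spelling of a string literal into the text of a #pragma
 * argument.  out must have room for as many bytes as lit occupies.
 * Returns the length written, or -1 when lit is not a string
 * literal or wide string literal.
 */
int destringize(const char *lit, char *out)
{
    size_t n = 0;

    if (*lit == 'L')                    /* drop the wide prefix */
        lit++;
    if (*lit != '"')
        return -1;
    lit++;                              /* drop the opening quote */

    while (*lit != '\0' && *lit != '"') {
        if (lit[0] == '\\' && (lit[1] == '"' || lit[1] == '\\'))
            lit++;                      /* keep only the escaped character */
        out[n++] = *lit++;
    }
    if (*lit != '"')                    /* no closing quote */
        return -1;
    out[n] = '\0';
    return (int)n;
}
```

<p>Other escape sequences are copied unchanged, since only \" and \\ are replaced.</p>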
<p>C99 stipulates that a #if expression is evaluated in the maximum integer type.  As "long long" and "unsigned long long" are required types, the type of a #if expression is "long long / unsigned long long" or larger.  C90 and C++98 stipulate that the type is long / unsigned long.  <b>mcpp</b>, however, evaluates it in long long / unsigned long long even in C90 and C++98, and issues a warning when the value is out of the range of long / unsigned long. *1</p>
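<p>The effect of the wider type can be observed with a #if expression whose value does not fit in 32 bits.  The sketch below uses hypothetical names and assumes a preprocessor that evaluates #if in long long or a larger type, as C99 requires; on a system where long is 32 bits wide, a strict C90 preprocessor would overflow here instead.</p>

```c
/*
 * 2147483647 + 1 overflows a 32-bit long, but not a type that is
 * at least 64 bits wide.
 */
#if (2147483647 + 1) > 0
#define IF_EXPR_IS_WIDE 1
#else
#define IF_EXPR_IS_WIDE 0
#endif

int if_expr_is_wide(void)
{
    return IF_EXPR_IS_WIDE;
}
```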
<p>Note:</p>
<p>*1 This is for compatibility with GCC and Visual C++ 2005, 2008.  It is difficult also for other compiler systems to implement C99 specifications all at once.  Probably, they will begin to implement them little by little with <tt>__STDC_VERSION__</tt> set to 199409L or so.</p>
<p>*2 C99 says that a #pragma argument that begins with STDC is not macro-expanded.  For other #pragma arguments, whether macro is expanded is implementation-defined.</p>
<br>

<h2><a name="3.8" href="#toc.3.8">3.8. Particular specifications for certain compiler systems</a></h2>
<p>The compiler-specific-builds of <b>mcpp</b> have some specifications peculiar to each compiler system.
Those particular specifications, other than execution options and #pragma, are explained in this section.</p>

<h3><a name="3.8.1" href="#toc.3.8.1">3.8.1. Variadic macro of GCC and Visual C</a></h3>
<p>GCC has had a variadic macro of its own specification since V.2, as shown in <a href="#3.9.1.6">3.9.1.6</a>.
In this manual we call it the GCC2-spec variadic macro.
Moreover, GCC has implemented one more variadic macro specification since V.3, as shown in <a href="#3.9.6.3">3.9.6.3</a>, which we call the GCC3-spec variadic macro.
GCC V.2.95 and later implement the C99 variadic macro, too.
Nevertheless, software such as glibc and the Linux system headers does not use the C99 variadic macro, nor even the GCC3-spec one, and still uses the GCC2-spec one.</p>
<p>The GCC-specific-build of <b>mcpp</b> in <i>STD</i> mode has implemented the GCC3-spec variadic macro since V.2.6.3, and the GCC2-spec one since V.2.7, in order to avoid inconvenience on Linux and with some other software.
Yet, <b>mcpp</b> warns when they are used.
The GCC-spec variadic macros, especially the GCC2-spec one, are not only unportable but also syntactically unclean.
Their use in newly written sources is not recommended.</p>
<p>Visual C did not implement variadic macros up to 2003.
Visual C 2005 finally implemented them.
Its specification is that of C99 with a modification similar to the GCC3-spec one.
When the variable arguments are absent, Visual C removes the immediately preceding comma.
It does not use the '##' token used in GCC3-spec.
The specification is illustrated below.
The Visual C documentation says that in the third example the comma is not removed.
In practice, however, Visual C removes the comma even in this case.</p>
<pre>
#define EMPTY
#define VC_VA( format, ...)     printf( format, __VA_ARGS__)
VC_VA( "var_args: %s %d\n", "Hello", 2005);     /* printf( "var_args: %s %d\n", "Hello", 2005); */
VC_VA( "absence of var_args:\n");               /* printf( "absence of var_args:\n");   */
VC_VA( "empty var_args:\n", EMPTY);             /* printf( "empty var_args:\n", );  */  /* trailing comma   */
</pre>
<p>The Visual C-specific-build of <b>mcpp</b> has implemented this special handling in <i>STD</i> mode since V.2.7.
Still, it warns when the specification is used.
For the third example above, <b>mcpp</b> follows the documented specification and does not remove the comma.</p>

<h3><a name="3.8.2" href="#toc.3.8.2">3.8.2. Handling of 'defined' by GCC</a></h3>
<p>When a macro containing the 'defined' token appears on a #if line, GCC expands it differently from the usual way.
I explain this problem in <a href="#3.9.4.6">3.9.4.6</a>.</p>
<p>The GCC-specific-build of <b>mcpp</b> from V.2.7 onward handles this sort of macro like GCC in <i>STD</i> mode.
This behavior was implemented to cope with a few wrong macros used in the Linux system headers.
Yet, it warns when a macro on a #if line contains the 'defined' token.
You should write correct #if expressions.</p>

<h3><a name="3.8.3" href="#toc.3.8.3">3.8.3. Asm Statement in Borland C and Other Special Syntaxes</a></h3>
<p>Borland C has the <samp>asm</samp> keyword.  This keyword is used to write assembly code as follows:</p>
<pre>
asm {
    mov x,4;
    ...;
}
</pre>
<p>This is quite irregular and deviates from the C grammar.  If a token in the block happens to have the same name as a macro, it will be macro-expanded.  This holds for both Borland C itself and <b>mcpp</b>.  It is recommended that assembler code be written in a separate <samp>.asm</samp> file.
<b>mcpp</b> does not treat this specially.</p>
<p>Visual C++ also has the <samp>__asm</samp> keyword, which provides similar functionality.</p>
<p>GCC provides a Standard-conforming built-in function <samp>asm()</samp> which is used as <samp>asm( " mov x,4\n")</samp>.</p>

<h3><a name="3.8.4" href="#toc.3.8.4">3.8.4. #import and Others</a></h3>

<h4>3.8.4.1. #import of Mac OS X / GCC</h4>
<p>GCC has the <samp>#import</samp> directive, a feature of Objective-C that was carried over to C/C++.
<samp>#import</samp> is a <samp>#include</samp> with an implicit '<samp>#pragma once</samp>'.
It is occasionally used in C/C++ sources on Mac OS X.</p>
<p><b>mcpp</b> V.2.7 and later implements this directive only on Mac OS X, in both the GCC-specific-build and the compiler-independent-build.</p>

<h4>3.8.4.2. #import and #using of Visual C</h4>
<p>Visual C has peculiar directives named <samp>#import</samp> and <samp>#using</samp>.  Although they have the form of preprocessing directives, they are in fact directives for the compiler and linker.
The <samp>#import</samp> of Visual C has no relation to that of GCC.</p>
<p>The Visual C-specific-build of <b>mcpp</b> outputs these lines as they are.</p>
<br>

<h2><a name="3.9" href="#toc.3.9">3.9. Problems of GCC and Compatibility with GCC</a></h2>
<p>Although I tried to develop <b>mcpp</b> in such manner that the GCC-specific-build provides compatibility with GCC / cpp (cc1) to the extent that it does not hinder practical use, it is still incompatible in many aspects.</p>
<p>First of all, as shown in Chapter 2, there are many differences in execution options.  <b>mcpp</b> implements neither -A option nor non-conforming directives, including #assert and #ident. *1</p>
<p>Fortunately, there seem to be quite few sources that cannot be compiled due to this lack of compatibility.</p>
<p>It is more problematic that there are some sources that assume special behaviors of old preprocessors.  Most of such source code receives a warning when -pedantic is specified in GCC.  <b>mcpp</b> in Standard mode, by default, provides almost the same behavior as GCC's -pedantic since it implements Standard conforming error checking.  However, since GCC/cpp, by default, allows such Standard violations without issuing a diagnostic, there are some sources that take advantage of this.</p>
<p>It is very easy to rewrite such non-conforming code to Standard-conforming code in most cases, so it is meaningless to take the trouble to write non-conforming code only to impair portability and, what is worse, to provide a hotbed of bugs.  When you find such code, do not hesitate to correct it. *2</p>
<p>Note:</p>
<p>*1 The functionality of #assert and #ident should be implemented using #pragma, if necessary.  The same can be said of #include_next and #warning, but these directives seem to be used sometimes in GCC systems, so I grudgingly implemented them in the GCC-specific-build.  A warning is issued, however, when they are used.</p>
<p>*2 Sections 3.9 through 3.9.3 of this document were written in 1998, when sources depending on traditional preprocessors were frequently found.
Since then, such sources have greatly decreased,
while sources depending on the local features and implementation trivialities of GCC have much increased.
Sections 3.9.4 and later, especially 3.9.8 and later, mainly describe such problems. (2008/03)</p>

<h3><a name="3.9.1" href="#toc.3.9.1">3.9.1. Preprocessing FreeBSD 2/Kernel Source</a></h3>
<p>Taking FreeBSD 2.2.2-R (1997/05) kernel source code as an example, this section explains some preprocessing problems.  All the directories that appear in this section are installed in /sys (/usr/src/sys).  Of the items I point out below, 3.9.1.7 and 3.9.1.8 are not necessarily Standard violations and work as expected in <b>mcpp</b>, but <b>mcpp</b> issues a warning because their coding is confusing.  3.9.1.6 is an enhancement and C99 provides the same functionality, but it differs from GNUC/cpp in notation.</p>

<h4>3.9.1.1. Multi-Line String Literal</h4>
<p>Assembly code is embedded in the following manner in <samp>i386/apm/apm.c, i386/isa/npx.c, i386/isa/seagate.c, i386/scsi/aic7xxx.h, dev/aic7xxx/aic7xxx_asm.c, dev/aic7xxx/symbol.c, gnu/ext2fs/i386-bitops.h, pc98/pc98/npx.c</samp>:</p>
<pre>
asm("
    asm code0
#ifdef PC98
    asm code1
#else
    asm code2
#endif
    ...
");
</pre>
<p>When no " closing a string literal appears by the end of line, GCC/cpp, by default, interprets that the string literal ends at the end of line.  The above coding is based on this specification.  In addition, the compiler-proper seems to interpret the whole content of asm() as a string literal spreading across lines.</p>
<p>I think that assembler source code should be written in a separate file, but if you must embed it in a ".c" file, write it in the following manner instead of using the confusing coding shown above.</p>
<pre>
asm(
    "  asm code0\n"
#ifdef PC98
    "  asm code1\n"
#else
    "  asm code2\n"
#endif
    "  ...\n"
);
</pre>
<p>Standard C conforming preprocessors will accept it.</p>
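<p>The following is a minimal sketch, not taken from the FreeBSD sources, of why the rewritten form is safe: each line is a complete string literal, and adjacent literals are concatenated after preprocessing.  The names <samp>asm_text</samp> and <samp>check_asm_text</samp> are hypothetical, and the text is stored in an ordinary array instead of being passed to <samp>asm()</samp> so that the example compiles on any compiler.</p>

```c
#include <assert.h>
#include <string.h>

/* PC98 is the configuration macro from the example above; define it or
   leave it undefined to select one of the two middle lines. */
static const char asm_text[] =
    "  asm code0\n"
#ifdef PC98
    "  asm code1\n"
#else
    "  asm code2\n"
#endif
    "  ...\n";

/* Self-check: the separate literals were joined into a single string. */
void check_asm_text(void)
{
#ifdef PC98
    assert(strcmp(asm_text, "  asm code0\n  asm code1\n  ...\n") == 0);
#else
    assert(strcmp(asm_text, "  asm code0\n  asm code2\n  ...\n") == 0);
#endif
}
```

<p>Because no literal spans a line break, the #ifdef lines are never inside a string and every conforming preprocessor sees them as directives.</p>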

<h4>3.9.1.2. #else junk, #endif junk</h4>
<p>The following line appears in <samp>ddb/db_run.c, netatalk/at.h, netatalk/aarp.c, net/if-ethersubr.c, i386/isa/isa.h, i386/isa/wdreg.h, i386/isa/tw.c, i386/isa/b004.c, i386/isa/matcd/matcd.c, i386/isa/sound/sound_calls.h, i386/isa/pcvt/pcvt_drv.c, pci/meteor.c, and pc98/pc98/pc98.h</samp>:</p>
<pre>
#endif MACRO
</pre>
<p>This line should be changed to:</p>
<pre>
#endif /* MACRO */
</pre>

<h4>3.9.1.3. #ifdef 0</h4>
<p>To my surprise, <samp>i386/apm/apm.c</samp> contains the following strange line:</p>
<pre>
#ifdef 0
</pre>
<p>Of course, this should be written as:</p>
<pre>
#if 0
</pre>
<p>This code must have been neither debugged nor used.</p>

<h4>3.9.1.4. Duplicate Definition of Macro</h4>
<p><samp>gnu/i386/isa/dgb.c</samp> has a duplicate definition of the following macro:</p>
<pre>
#define DEBUG
</pre>
<p>Some of the header files have a macro definition conflicting with this.</p>
<p>Standard C regards conflicting duplicate definitions as a "violation of constraint", but how they are treated depends on the compiler system; some make the first definition valid after issuing an error message and others, like GCC 2/cpp, make the last definition valid without issuing any message by default.  To make the last definition valid, the following line should be added immediately before the last definition.</p>
<pre>
#undef DEBUG
</pre>
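<p>As a small sketch of this pattern, the values 1 and 2 below are assumptions chosen only to make the effect visible (the original <samp>DEBUG</samp> has an empty replacement list), and <samp>debug_level</samp> is a hypothetical function:</p>

```c
#include <assert.h>

#define DEBUG 1          /* stands for a definition arriving from a header */

/* Redefining DEBUG with a different replacement list at this point would
   violate a constraint, so the old definition is discarded first. */
#undef DEBUG
#define DEBUG 2          /* the definition this file actually wants */

int debug_level(void)
{
    return DEBUG;
}

/* Self-check: only the last definition is in effect. */
void check_debug_level(void)
{
    assert(debug_level() == 2);
}
```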

<h4>3.9.1.5. #warning</h4>
<p><samp>i386/isa/if_ze.c, and i386/isa/if_zp.c</samp> have the #warning directive.  This is the only non-Standard directive I found in the kernel source.  To conform to Standard C, there is no way but to comment out this line.</p>
<p><b>mcpp</b> accepts #warning.</p>

<h4><a name="3.9.1.6">3.9.1.6. Variable Argument Macros</a></h4>
<p><samp>gnu/ext2fs/ext2_fs.h and i386/isa/mcd.c</samp> have the following macro that takes variable number of arguments:</p>
<pre>
#define MCD_TRACE(fmt, a...)        \
{                                   \
    if (mcd_data[unit].debug) {     \
        printf("mcd%d: status=0x%02x: ",    \
            unit, mcd_data[unit].status);   \
        printf(fmt, ## a);          \
    }                               \
}

#   define ext2_debug(fmt, a...)  { \
        printf("EXT2-fs DEBUG (%s, %d): %s:",   \
            __FILE__, __LINE__, __FUNCTION__);  \
        printf(fmt, ## a);          \
        }
</pre>
<p>This is a GCC-specific enhanced specification and cannot be applied to other compiler systems.  The above "## a" can be simply written as "a".  With ## and in the absence of an argument corresponding to "<samp>a...</samp>" in a macro invocation, the preceding comma is deleted.  C99 also provides for variable argument macros, but their notation differs from that of GCC.  The above example is written as follows in C99:</p>
<pre>
#define MCD_TRACE( ...)             \
{                                   \
    if (mcd_data[unit].debug) {     \
        printf("mcd%d: status=0x%02x: ",    \
            unit, mcd_data[unit].status);   \
        printf( __VA_ARGS__);       \
    }                               \
}

#  define ext2_debug( ...)     {    \
       printf("EXT2-fs DEBUG (%s, %d): %s:",   \
           __FILE__, __LINE__, __FUNCTION__);  \
       printf( __VA_ARGS__);   \
       }
</pre>
<p>The most annoying difference is that C99 requires one or more arguments in a macro invocation corresponding to "<samp>...</samp>" while GNUC/cpp accepts zero or more arguments corresponding to "<samp>a...</samp>".  To handle this, when there is no argument corresponding to "<samp>...</samp>", <b>mcpp</b> issues a warning instead of making it an error.  Therefore, you can change the above code as follows:</p>
<pre>
#define MCD_TRACE(fmt, ...)         \
{                                   \
    if (mcd_data[unit].debug) {     \
        printf("mcd%d: status=0x%02x: ",    \
            unit, mcd_data[unit].status);   \
        printf(fmt, __VA_ARGS__);   \
    }                               \
}

#  define ext2_debug(fmt, ...)     {    \
       printf("EXT2-fs DEBUG (%s, %d): %s:",   \
           __FILE__, __LINE__, __FUNCTION__);  \
       printf(fmt, __VA_ARGS__);   \
       }
</pre>
<p>This is simpler, with one-to-one correspondence.  However, this way of writing has the disadvantage that a comma immediately before an empty argument remains, resulting in, for example, printf( fmt, ).  In this case, there is no other way but to write the macro definition in accordance with the C99 specification, or to avoid using an empty argument in a macro invocation.  A harmless token, such as NULL or 0, can be passed instead, as in <samp>MCD_TRACE(fmt, NULL)</samp>. *1</p>
<p>Note:</p>
<p>*1 GCC 2.95.3 or later also implements variable argument macros based on the C99 syntax.  It is recommended to use this syntax.  The GCC-specific one provides the flexibility of allowing zero variable arguments, but its notation is bad in that (1) for the "<samp>args...</samp>" parameter, white space must not be inserted between "<samp>args</samp>" and "<samp>...</samp>", yet such a pp-token is not permitted in C/C++, and that (2) it is undesirable for the notation of the token concatenation operator to be used with a different meaning in a replacement list.  It would be desirable to allow zero variable arguments based on the C99 notation.  GCC 3 introduced a notation for variable argument macros that is a mixture of GCC 2's traditional notation and the C99 one.  For details, refer to <a href="#3.9.6.3">3.9.6.3</a>.</p>
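<p>The C99 form in which the format string itself is part of "<samp>...</samp>" can be sketched as follows.  <samp>TRACE</samp>, <samp>trace_buf</samp> and <samp>check_trace</samp> are hypothetical names, and <samp>snprintf</samp> into a buffer replaces <samp>printf</samp> only so that the result can be inspected.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical buffer that receives the formatted text. */
static char trace_buf[64];

/* The format string is the first of the variable arguments, so "..."
   always receives at least one argument and no dangling comma arises. */
#define TRACE(...)  snprintf(trace_buf, sizeof trace_buf, __VA_ARGS__)

void check_trace(void)
{
    TRACE("status=0x%02x", 0x1f);       /* format plus one value */
    assert(strcmp(trace_buf, "status=0x1f") == 0);

    TRACE("no arguments");              /* format only: still valid C99 */
    assert(strcmp(trace_buf, "no arguments") == 0);
}
```

<p>The price of this form is that the macro can no longer name the format parameter separately, which is why it cannot be used when the definition needs to treat <samp>fmt</samp> on its own.</p>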

<h4>3.9.1.7. Empty Argument in Macro Call</h4>
<p>The following macro invocations appear in <samp>nfs/nfs.h, nfs/nfsmount.h, nfs/nfsmode.h, netinet/if_ether.c, netinet/in.c, sys/proc.h, sys/socketvars.h, i386/scsi/aic7xxx.h, i386/include/pmap.h, dev/aic7xxx/scan.l, dev/aic7xxx/aic7xxx_asm.c, kern/vfs_cache.c, pci/wd82371.c, vm/vm_object.h, and vm/device/pager.c</samp>.  They also appear in <samp>/usr/include/nfs/nfs.h</samp>.</p>
<pre>
LIST_HEAD(, arg2)
TAILQ_HEAD(, arg2)
CIRCLEQ_HEAD(, arg2)
SLIST_HEAD(, arg2)
STAILQ_HEAD(, arg2)
</pre>
<p>The first argument is empty.  C99 approved empty arguments but C90 regarded them as undefined.  Considering that an argument may happen to be empty during a nested macro invocation, empty arguments should be approved.  However, it is neither necessary nor desirable to write an empty argument in source code.  Note that for a one-argument macro, there is a syntactic ambiguity between an empty argument and a missing argument.</p>
<p>Taking everything into consideration, the following notation is recommended:</p>
<pre>
#define EMPTY

LIST_HEAD(EMPTY, arg2)
TAILQ_HEAD(EMPTY, arg2)
CIRCLEQ_HEAD(EMPTY, arg2)
SLIST_HEAD(EMPTY, arg2)
STAILQ_HEAD(EMPTY, arg2)
</pre>
<p>Any Standard C conforming preprocessor will accept this notation.</p>
<p>By the way, some of the header files (in the nfs directory) shown above neither have the macro definitions shown above nor #include any other header files.  This is because such header files assume that these macro definitions exist in sys/queue.h and that *.c programs will #include sys/queue.h first.  These files give rise to ambiguity.</p>
<p><samp>kern/kern_mib.c</samp> has the following macro invocation:</p>
<pre>
SYSCTL_NODE(, arg2, arg3, arg4, arg5, arg6, arg7, arg8, arg9)
</pre>
<p>In this case, the first argument cannot be changed to <tt>EMPTY</tt>, because the corresponding macro definition in <samp>sys/sysctl.h</samp> is as follows:</p>
<pre>
#define SYSCTL_NODE(parent, nbr, name, access, handler, descr)      \
    extern struct linker_set sysctl_##parent##_##name;              \
    SYSCTL_OID(parent, nbr, name, CTLTYPE_NODE|access,              \
        (void*)&amp;sysctl_##parent##_##name, 0, handler, "N", descr);  \
    TEXT_SET(sysctl_##parent##_##name, sysctl__##parent##_##name);
</pre>
<p>In other words, the argument is an operand of the ## operator and therefore is not macro-expanded.  The arguments of the <tt>SYSCTL_OID</tt> macro shown above, including the first one, are not macro-expanded either.  In this case, there is no way but to leave the empty argument as it is.  *1</p>
<p>Note:</p>
<p>*1 C99 approves empty arguments as legitimate.  Taking macros such as <tt>SYSCTL_NODE</tt>() and <tt>SYSCTL_OID</tt>() into consideration, the <tt>EMPTY</tt> macro is not a cure-all, and there is some justification for using empty arguments.  In addition, even if <tt>EMPTY</tt> is used, a nested macro invocation may cause empty arguments.  However, for source readability, using <tt>EMPTY</tt> is recommended whenever possible.</p>
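<p>The difference between the two situations can be checked with a small sketch.  <samp>LIST_HEAD_SKETCH</samp> and <samp>NODE</samp> are hypothetical, simplified stand-ins for the queue macros and for <tt>SYSCTL_NODE</tt>; <samp>XSTR</samp> is used only to make the expansion result visible as a string.</p>

```c
#include <assert.h>
#include <string.h>

#define EMPTY
#define STR(x)      #x
#define XSTR(x)     STR(x)      /* expands x first, then stringizes it */

/* Plain use of the parameter: the argument is macro-expanded. */
#define LIST_HEAD_SKETCH(name, type)    struct name { type *first; }

/* The parameter is an operand of ##: the argument is NOT macro-expanded. */
#define NODE(parent, name)              sysctl_##parent##_##name

/* EMPTY vanishes here, giving:  static struct { int *first; } anon_head; */
static LIST_HEAD_SKETCH(EMPTY, int) anon_head;

void check_empty_argument(void)
{
    anon_head.first = 0;
    assert(anon_head.first == 0);

    /* EMPTY is pasted literally, which is not the intended name. */
    assert(strcmp(XSTR(NODE(EMPTY, kern)), "sysctl_EMPTY_kern") == 0);

    /* A truly empty argument (valid since C99) gives the intended name. */
    assert(strcmp(XSTR(NODE(, kern)), "sysctl__kern") == 0);
}
```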

<h4>3.9.1.8. Object-Like Macros Replaced with Function-like Macro Name</h4>
<p><samp>i386/include/endian.h</samp>, as well as <samp>/usr/include/machine/endian.h</samp>, has the following macro definitions. (There are four definitions of this kind.)</p>
<pre>
#define __byte_swap_long(x) (replacement text)
#define NTOHL(x)            (x) = ntohl ((u_long)x)
#define ntohl               __byte_swap_long
</pre>
<p>The problem is the <samp>ntohl</samp> definition.  Although <samp>ntohl</samp> is an object-like macro, it is expanded to a function-like macro name, then rescanned together with the subsequent text, and is expanded as if it were a function-like macro.  This way of macro-expansion has been regarded as an implicit specification since K&amp;R 1st, and Standard C somehow approved it as legitimate.  However, as I discuss in other documents, it is this specification that makes macro-expansion unnecessarily complicated and brings confusion to the Standard documents.  This specification is a defect. *1</p>
<p>This <samp>ntohl</samp> is actually a function-like macro, written as an object-like macro by omitting the parameter list.  It is better to define it as the function-like macro that it is:</p>
<pre>
#define ntohl(x)    __byte_swap_long(x)
</pre>
<p>This causes no problem.</p>
<p><samp>i386/isa/sound/os.h</samp> has the same kind of macro definitions:</p>
<pre>
#define INB         inb
#define INW         inb
</pre>
<p>This should be written as follows:</p>
<pre>
#define INB(x)      inb(x)
#define INW(x)      inb(x)
</pre>
<p>Note:</p>
<p>*1 ISO 9899:1990 Corrigendum 1:1994 regarded the notation as undefined.  C99 replaced this article with another.  However, the Standard documents are still confusing about this.  For details, see <a href="cpp-test.html#2.7.6"> cpp-test.html#2.7.6</a>.</p>
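<p>Both styles can be compared in a short sketch.  <samp>BYTE_SWAP16</samp> is a hypothetical 16-bit swap standing in for <samp>__byte_swap_long</samp>, and <samp>swap_obj</samp> and <samp>swap_fn</samp> are illustrative names only.</p>

```c
#include <assert.h>

/* Hypothetical 16-bit byte swap standing in for __byte_swap_long. */
#define BYTE_SWAP16(x)  ((((x) & 0xffu) << 8) | (((x) >> 8) & 0xffu))

/* Object-like: works only because the expansion result BYTE_SWAP16 is
   rescanned together with the "(...)" that follows the macro name. */
#define swap_obj        BYTE_SWAP16

/* Function-like: the parameter list is stated where the macro is defined. */
#define swap_fn(x)      BYTE_SWAP16(x)

void check_swap(void)
{
    assert(swap_obj(0x1234u) == 0x3412u);
    assert(swap_fn(0x1234u) == 0x3412u);
}
```

<p>The two definitions give the same result here; the function-like one simply does not depend on the text that happens to follow the macro name.</p>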

<h4>3.9.1.9. Preprocessing .S File</h4>
<p>Some kernel sources are contained in several ".S" files, that is, they are written in assembler.  These sources contain #include's or #ifdef's, which require preprocessing.  To preprocess them, in FreeBSD 2.2.2-R, 'cc' is called with the '-x assembler-with-cpp' option, and 'cc' calls '/usr/libexec/cpp' with the '-lang-asm' option and then calls 'as'.</p>
<p>Of course, this way of using .S files is non-conforming.  This assembler source code must not contain a token that happens to have the same name as a macro.  White spaces between tokens and at the beginning of a line must be retained during preprocessing.  In addition, if the first token at the beginning of a line is a # indicating an assembler comment, special processing is required on the preprocessor side.  This not only considerably limits the available preprocessors but also increases the possibility of unknowingly introducing bugs.  So, using .S files in this way is not recommended. *1</p>
<p>To preprocess source code for use with several types of machines, the code should be written in the following manner and saved in a ".c" file rather than a ".S" file.  4.4BSD-Lite actually adopts this way of coding.</p>
<pre>
asm(
    "  asm code0\n"
#ifdef Machine_A
    "  asm code1\n"
#else
    "  asm code2\n"
#endif
    "  ...\n"
);
</pre>
<p>Note:</p>
<p>*1 In FreeBSD 2.0-R, these kernel sources are contained not in *.S but in *.s files.  The Makefile is so defined as to call 'cpp', instead of 'cc', to process them.  Then the 'cc' calls 'as'.  When the 'cpp' is called, '/usr/bin/cpp' is invoked.  '/usr/bin/cpp' is a shell-script that calls '/usr/libexec/cpp -traditional'.  This method was more convenient in that the preprocessor to be used could be changed by modifying the script.</p>

<h3><a name="3.9.2" href="#toc.3.9.2">3.9.2. Preprocessing FreeBSD 2/libc Source</a></h3>
<p>I compiled all the source files in <samp>/usr/src/lib/libc</samp> of FreeBSD 2.2.2-R.  There was no problem, probably because most of them come from 4.4BSD-Lite without much modification.  It is quite rare and surprising to find such a huge number of source files of excellent quality gathered together.</p>
<p>In only one place, gen/getgrent.c, I found the following coding.  Of course, the ";" at the end of the line is surplus.</p>
<pre>
#endif;
</pre>

<h3><a name="3.9.3" href="#toc.3.9.3">3.9.3. Problems Concerning GCC 2/cpp</a></h3>
<p>As seen so far, writing Standard-conforming source code with better portability in a more secure manner neither requires much effort nor has any demerits.  In spite of this, why does source code less conforming to the Standards still exist at all?</p>
<p>Comparing the FreeBSD 2.0-R kernel sources with those of 2.2.2-R shows that non-conforming ones have not decreased in number.  The problem is that newer sources are not necessarily more conforming to the Standards.  There are few non-conforming sources in 4.4BSD-Lite.  This is probably because the 4.4BSD sources were rewritten to conform to Standard C and POSIX.  However, during the process of porting these sources to FreeBSD, the old writing style revived in some sources.  For example, although the ntohl shown above is written as <samp>ntohl(x)</samp> in 4.4BSD-Lite, it is written as <samp>ntohl</samp> in FreeBSD.  Why did a notation that had once been abandoned revive?</p>
<p>I blame GCC/cpp for this revival, since it passes these non-conforming sources without issuing a diagnostic.  If -pedantic had been the default behavior, the old style source would never have revived.  If -pedantic-errors had been the default behavior, though, GCC/cpp would not have been put into practical use because too many sources would have failed to compile.  The gcc man page describes the -pedantic option as: "There is no reason to use this option except for satisfying pedants."  Now that eight years have already passed since Standard C was established, it is high time that GCC/cpp made -pedantic the default, even if it does not go so far as to make -pedantic-errors the default. *1</p>
<p>In FreeBSD 2.0-R, nested comments were sometimes found, but in 2.2.2-R, they disappeared.  This is because GCC/cpp no longer allowed them.  This has nothing to do with -pedantic, but I want to say how influential preprocessor's source checking is.</p>
<p>Note:</p>
<p>*1 I wrote 3.9.3 subsection in 1998.  After that, gcc's man page or info deleted this expression, however, the specification remains almost the same.</p>

<h3><a name="3.9.4" href="#toc.3.9.4">3.9.4. Preprocessing Linux/glibc 2.1</a></h3>
<p>I compiled the glibc (GNU LIBC) 2.1.3 sources (February, 2000).  Unlike the FreeBSD libc sources, they presented many problems.  Some sources are written based on GCC/cpp's undocumented specifications, and it took me a lot of time to identify them.</p>

<h4>3.9.4.1. Multi-Line String Literal</h4>
<p><samp>sysdeps/i386/dl-machine.h and stdlib/longlong.h</samp> have many multi-line string literals as shown below:</p>
<pre>
#define MACRO asm("
    instr 0
    instr 1
    instr 2
")
</pre>
<p>Some string literals are very long.  <samp>compile/csu/version-info.h</samp> created by make also has a multi-line string literal.  Of course, it is non-conforming, but GCC treats it as a string literal with embedded &lt;newline&gt;.</p>
<p>The -lang-asm (-x assembler-with-cpp, -a) option allows <b>mcpp</b> to convert a multi-line string literal into the following code:</p>
<pre>
#define MACRO asm("\n  instr 0\n  instr 1\n  instr 2\n")
</pre>
<p>However, this option cannot work properly for a string literal with a directive inserted in the middle as shown in 3.9.1.1, in which case there is no way but to rewrite the source.</p>

<h4>3.9.4.2. #include_next, #warning</h4>
<p>#include_next appears in the following files:</p>
<p><samp>catgets/config.h, db2/config.h, include/fpu_control.h, include/limits.h, include/bits/ipc.h, include/sys/sysinfo.h, locale/programs/config.h, and sysdeps/unix/sysv/linux/a.out.h</samp></p>
<p><samp>sysvipc/sys/ipc.h</samp> has #warning.</p>
<p>Although these directives are not approved by the Standard C, #include_next, in particular, becomes indispensable for glibc 2.  So, <b>mcpp</b> for GCC implements #include_next and #warning.</p>
<p>The problem with #include_next is not only that it violates the Standard but also that which headers are actually included depends on the setting of include directories and their search order, which users can change via environment variables.</p>
<p>When glibc is installed, some files in glibc's include directory are copied to the /usr/include directory.  These files are used as system header files.  That these header files contain #include_next means system headers become patchy.  It seems to be time to reorganize them.</p>

<h4>3.9.4.3. Variable Argument Macros</h4>
<p>The following files contain definitions of macros with variable number of arguments based on the GCC specification, as well as macro invocations:</p>
<p><samp>elf/dl-lookup.c, elf/dl-version.c, elf/ldsodefs.h, glibc-compat/nss_db/db-XXX.c, glibc-compat/nss_files/files-XXX.c, linuxthreads/internals.h, locale/loadlocale.c,  locale/programs/linereader.h,  locale/programs/locale.c, nss/nss_db/db-XXX.c, nss/nss_files/files-XXX.c, sysdeps/unix/sysdep.h, sysdeps/unix/sysv/linux/i386/sysdep.h, and sysdeps/i386/fpu/bits/mathinline.h</samp></p>
<p>This is a deviation from the C99 Standard.  You must rewrite the source code before you can use <b>mcpp</b>. *1</p>

<p>Note:</p>
<p>*1 This is a spec since GCC2.  There is another spec of GCC3 which is a compromise of GCC2 and C99 specs.  See <a href="#3.9.6.3">3.9.6.3</a>.</p>

<h4>3.9.4.4. Empty Argument During Macro Calls</h4>
<p>The following files have macro invocations with empty arguments:</p>
<p><samp>catgets/catgetsinfo.h, elf/dl-open.c, grp/fgetgrent_r.c, libio/clearerr_u.c, libio/rewind.c, libio/clearerr.c, libio/iosetbuffer.c, locale/programs/ld-ctype.c, locale/setlocale.c, login/getutent_r.c, malloc/thread-m.h, math/bits/mathcalls.h, misc/efgcvt_r.c, nss/nss_files/files-rpc.c, nss/nss_files/files-network.c, nss/nss_files/files-hosts.c, nss/nss_files/files-proto.c, pwd/fgetpwent_r.c, shadow/sgetspent_r.c, sysdeps/unix/sysv/linux/bits/sigset.h, sysdeps/unix/dirstream.h</samp></p>
<p><samp>math/bits/mathcalls.h</samp>, in particular, contains as many as 79 empty arguments.  This header file is installed in <samp>/usr/include/bits/mathcalls.h</samp> and is #included by <samp>/usr/include/math.h</samp>.  Even with an <tt>EMPTY</tt> macro, nested macro invocations generate a lot of empty arguments.  Are there any other ways to write macros more clearly?</p>

<h4>3.9.4.5. Object-Like Macros Replaced with Function-like Macro Name</h4>
<p>The following files contain object-like macro definitions replaced with function-like macro names:</p>
<p><samp>argp/argp-fmtstream.h, ctype/ctype.h, elf/sprof.c, elf/dl-runtime.c, elf/do-rel.h, elf/do-lookup.h, elf/dl-addr.c, io/ftw.c, io/ftw64.c, io/sys/stat.h, locale/programs/ld-ctype.c, malloc/mcheck.c, math/test-*.c, nss/nss_files/files-*.c, posix/regex.c, posix/getopt.c, stdlib/gmp-impl.h, string/bits/string2.h, string/strcoll.c, sysdeps/i386/i486/bits/string.h, sysdeps/generic/_G_config.h, sysdeps/unix/sysv/linux/_G_config.h</samp></p>
<p>Of these, some function-like macros, like the ones in math/test-*.c , are first replaced with an object-like macro name and then further replaced with a function-like macro name.  Why did these macros have to be written in this way?</p>

<h4><a name="3.9.4.6">3.9.4.6. Macros Expanded to 'defined'</a></h4>
<p><samp>sysdeps/generic/_G_config.h, sysdeps/unix/sysv/linux/_G_config.h, and malloc/malloc.c</samp> contain the following macro definition expanded to the "defined" pp-token.</p>
<pre>
#define HAVE_MREMAP defined(__linux__) &amp;&amp; !defined(__arm__)
</pre>
<p>The intention of this macro definition is that with the following directive,</p>
<pre>
#if HAVE_MREMAP
</pre>
<p>the above line is expected to be expanded as follows:</p>
<pre>
#if defined(__linux__) &amp;&amp; !defined(__arm__)
</pre>
<p>However, the behavior is undefined in Standard C when a #if line has a "defined" pp-token in a macro expansion result.  Apart from that, this macro definition is strange in the first place.</p>
<p>The <tt>HAVE_MREMAP</tt> macro is first replaced with the following,</p>
<pre>
defined(__linux__) &amp;&amp; !defined(__arm__)             (1)
</pre>
<p>and then the identifiers <samp>defined, __linux__</samp> and <samp>__arm__</samp> are rescanned for more macro replacement.  If any of them is a macro, it is expanded.  In this case, <samp>defined</samp> cannot be defined as a macro (Otherwise, it causes another undefined result), and if <samp>__linux__</samp> is defined as 1 and <samp>__arm__</samp> is not defined, this macro is finally expanded as follows:</p>
<pre>
defined(1) &amp;&amp; !defined(__arm__)
</pre>
<p>defined(1), of course, is a syntax error of a #if expression.</p>
<p>However, GCC/cpp stops macro expansion at (1) and regards it as the final macro expansion result of the #if line.  Since this is "undefined" anyhow, this GNU specification cannot be described as wrong, but it lacks consistency in that the way a macro is expanded differs between a #if line and other lines.  At the very least, it lacks portability. *1</p>
<p>The above code should be written as follows:</p>
<pre>
#if defined(__linux__) &amp;&amp; !defined(__arm__)
#define HAVE_MREMAP 1
#endif
</pre>
<p>I hope this kind of confusing code will be eliminated as early as possible. *2</p>
<p>Note:</p>
<p>*1 GCC 2/cpp internally treats <samp>defined</samp> in a #if line as a special macro.  For this reason, when GCC/cpp rescans the following sequence of tokens for macro expansion, it evaluates it as a #if expression, as a result of special handling of <samp>defined</samp> pseudo-macro, instead of expanding the original macro.  In other words, distinction between macro expansion and #if expression evaluation is ambiguous.</p>
<blockquote>
<samp>defined(__linux__) &amp;&amp; !defined(__arm__)</samp>
</blockquote>
<p>This problem relates to GCC/cpp's own program structure.  GCC 2/cpp has a de facto main routine rescan(), which is a macro rescanning routine.  This routine reads and processes the source file from the beginning to the end, during the course of which it calls a routine that processes preprocessing directives.  Although implementing everything using macros is a traditional program structure for a macro processor, this structure can be thought to cause the mixture of macro expansion and other processing.</p>
<p>*2 In glibc 2.4, this macro was corrected.
Nevertheless, many other macros of the same sort were newly defined.</p>
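<p>The portable form recommended above can be tried with the following sketch.  <samp>FEATURE_A</samp>, <samp>FEATURE_B</samp>, <samp>HAVE_FEATURE</samp> and <samp>have_feature</samp> are hypothetical stand-ins for <samp>__linux__</samp>, <samp>__arm__</samp> and <tt>HAVE_MREMAP</tt>, chosen so that the result does not depend on the build platform.  Unlike the code above, the sketch also defines the macro as 0 in the false case, which is an assumption of this example and not part of the original.</p>

```c
#include <assert.h>

#define FEATURE_A 1
/* FEATURE_B is deliberately left undefined. */

/* 'defined' is evaluated directly in the #if line, never through a macro,
   so every conforming preprocessor gives the same result. */
#if defined(FEATURE_A) && !defined(FEATURE_B)
#define HAVE_FEATURE 1
#else
#define HAVE_FEATURE 0
#endif

int have_feature(void)
{
#if HAVE_FEATURE
    return 1;
#else
    return 0;
#endif
}

void check_have_feature(void)
{
    assert(have_feature() == 1);
}
```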

<h4>3.9.4.7. Preprocessing .S File</h4>
<p>The files named *.S contain assembler source code requiring preprocessing.  Some of these files have preprocessing directives, such as #include, #define, and #if.  In addition, the file named <samp>compile/csu/crti.S</samp> generated by Make contains the following lines:</p>
<pre>
#APP
</pre>
<p>or</p>
<pre>
#NO_APP
</pre>
<p>From a syntax point of view, preprocessors cannot tell whether these lines are invalid preprocessing directives or valid assembler comments.  GCC seems to leave these lines as they are during preprocessing and treat them as assembler comments.</p>
<p>Concatenation of pp-tokens using the ## operator sometimes generates an invalid pp-token.  GCC/cpp outputs these pp-tokens without issuing a diagnostic.</p>
<p>For compatibility with GCC, I reluctantly decided that, with the -lang-asm (-x assembler-with-cpp, -a) option, <b>mcpp</b> does not treat these non-conforming directives and invalid pp-tokens generated by ## as errors; it outputs them as they are and issues a warning.</p>
<p>Essentially, these sources should be processed with an assembler macro processor.  GNU seems to provide a macro processor called gasp, but it seems to be scarcely used for some reason.</p>

<h4><a name="3.9.4.8">3.9.4.8. Problems of rpcgen and -dM Option</a></h4>
<p>When invoked with the -dM option, GCC outputs only macro definitions.  stdlib/isomac.c uses this output in the 'make check' routine.</p>
<p>The problem with isomac.c is that it accepts only GCC/cpp's macro definition file format and regards a comment or a blank line as an error.</p>
<p>The glibc make sometimes uses a program called rpcgen.  The problem with rpcgen is that it accepts only GCC/cpp's output format of preprocessor line number information, as follows:</p>
<pre>
#123 "filename"
</pre>
<p>Rpcgen accepts neither:</p>
<pre>
#line 123
</pre>
<p>nor</p>
<pre>
#line 123 "filename"
</pre>
<p>Rpcgen regards them as error.</p>
<p>I reluctantly decided that GCC-specific-<b>mcpp</b> uses the GCC format by default.  Rpcgen's specification is poor in that it is based on a particular compiler system's format and cannot accept the standard one.</p>
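<p>For illustration only, a tool that reads preprocessor output could accept all three forms with little effort.  The following is a minimal sketch under that assumption; <samp>parse_line_marker</samp> and <samp>check_line_marker</samp> are hypothetical names and have nothing to do with the actual rpcgen source.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

static const char *skip_blanks(const char *p)
{
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}

/* Hypothetical helper: returns 1 and fills *line and file[] when s is a
   line marker in the GCC format or in the #line format, 0 otherwise.
   file[] is set to "" when the marker carries no file name. */
int parse_line_marker(const char *s, long *line, char *file, size_t size)
{
    const char *p = skip_blanks(s);
    char *end;

    if (size == 0 || *p++ != '#')
        return 0;
    p = skip_blanks(p);
    /* An optional "line" keyword distinguishes the standard format. */
    if (strncmp(p, "line", 4) == 0 && (p[4] == ' ' || p[4] == '\t'))
        p = skip_blanks(p + 4);
    if (!isdigit((unsigned char)*p))
        return 0;
    *line = strtol(p, &end, 10);
    p = skip_blanks(end);
    file[0] = '\0';
    if (*p == '"') {                    /* optional "filename" part */
        const char *q = strchr(p + 1, '"');
        size_t n;

        if (q == NULL)
            return 0;
        n = (size_t)(q - (p + 1));
        if (n >= size)
            return 0;
        memcpy(file, p + 1, n);
        file[n] = '\0';
    }
    return 1;
}

void check_line_marker(void)
{
    long n = 0;
    char f[32];

    assert(parse_line_marker("#123 \"filename\"", &n, f, sizeof f) && n == 123);
    assert(strcmp(f, "filename") == 0);
    assert(parse_line_marker("#line 123", &n, f, sizeof f) && f[0] == '\0');
    assert(parse_line_marker("#line 123 \"filename\"", &n, f, sizeof f));
    assert(!parse_line_marker("#APP", &n, f, sizeof f));
}
```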

<h4>3.9.4.9. -include, -isystem and -I- Options</h4>
<p>Glibc 2.1 'makefile' often uses the -include option and sometimes uses -isystem and -I- options.  The former can be substituted with #include at the beginning of source code.  The latter two are less necessary; these are only necessary to update system headers.</p>
<p>Only GCC-specific-build of <b>mcpp</b> implements these two options, but I would like these less necessary options to be made obsolete. *1</p>
<p>Note:</p>
<p>*1 GCC/cpp provides several more options that specify include directories and their search orders, such as -iprefix, -iwithprefix, and -idirafter.  It also provides the -remap option that specifies mapping between long-file-names and MS-DOS 8+3 format filenames.  On CygWIN systems, specs files contain these options, but it is not necessary to use these options because include directories can be specified with environment variables and because such mapping is no longer necessary on CygWIN.</p>

<h4>3.9.4.10. Undocumented Predefined Macros</h4>
<p>This is not a problem of glibc, but of GCC.  The following macros are GCC/cpp predefined macros although their names do not appear in documentation.</p>
<pre>
__VERSION__,  __SIZE_TYPE__,  __PTRDIFF_TYPE__, __WCHAR_TYPE__
</pre>
<p>On Vine Linux 2.1 (egcs-1.1.2) systems, <tt>__VERSION__</tt> is set to "egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)".  On many systems, including Linux/i386, the other three macros are defined as the types unsigned int, int, and long int, respectively.  However, on FreeBSD and CygWIN systems, the types differ slightly (I do not know why).  Why do these predefined macros remain undocumented?</p>

<h4>3.9.4.11. Undocumented Environment Variables</h4>
<p>The strangest thing is the undocumented environment variable named <samp>SUNPRO_DEPENDENCIES</samp>. <samp>sysdeps/unix/sysv/linux/Makefile</samp> contains the following script:</p>
<pre>
SUNPRO_DEPENDENCIES='$(@:.h=.d)-t $@' \
$(CC) -E -x c $(sysinclude) $&lt; -D_LIBC -dM | \
... \
etc.
</pre>
<p>The intent of this script is to specify a file name with the environment variable <samp>SUNPRO_DEPENDENCIES</samp>, and to have cpp output macro definitions in source code and dependency description lines between source files to that file.</p>
<p>I had no other way but to read the GCC/cpp source code (egcs-1.1.2/gcc/cccp.c) to know how this environment variable works.</p>
<p>In addition, there is another environment variable, <samp>DEPENDENCIES_OUTPUT</samp>, which has a similar function.  The difference between the two is that <samp>SUNPRO_DEPENDENCIES</samp> also outputs dependency description lines among system headers, but <samp>DEPENDENCIES_OUTPUT</samp> does not.</p>
<p>Only GCC-specific-build of <b>mcpp</b> enables these two environment variables, but I would like these undocumented specifications to be made obsolete as early as possible.</p>

<h4>3.9.4.12. Other Problems</h4>
<p>Linux (i386)/GCC 2 appends -Asystem(unix), -Acpu(i386) or -Amachine(i386) to the cpp invocation options by means of the specs file.  As far as glibc 2.1.3 for Linux/x86 is concerned, there seems to be no source code that utilizes this functionality.</p>
<p>It is a big problem that glibc's system headers have become patchy and very complicated.  A small difference in settings may result in a big difference in preprocessing results.</p>
<p>On the other hand, Glibc 2.1.3 did not contain #else junk, #endif junk, or duplicate macro definitions that were found in FreeBSD 2.2.2/kernel sources.  In some aspects, Glibc 2.1 source is better organized than FreeBSD 2/kernel source.</p>
<p>However, as a whole, there were not a few sources that are based on GCC-specific specifications in glibc 2.1, which impairs portability to other compiler systems although such sources form only a small portion of several thousand source files.  Dependence on GCC local specifications is not desirable for program readability and maintainability.  I hope that GCC V.3 will make obsolete these local specifications and that all the source code based on them will be completely rewritten.</p>

<h3><a name="3.9.5" href="#toc.3.9.5">3.9.5. To Use <b>mcpp</b> with GCC 2</a></h3>
<p>You must modify some source code as follows before you can use <b>mcpp</b> to compile glibc 2.1 sources: *1</p>
<ol>
<li>Macro definitions with variable number of arguments: Modify the 14 files in 3.9.4.3 as shown in 3.9.1.6.  Of course, you had better save the original files.<br>
<br>
<li>Macros contained in the three files shown in 3.9.4.6 that have <samp>defined</samp> in their replacement lists:  <samp>/usr/include/_G_config.h</samp> is a file generated when <samp>sysdeps/unix/sysv/linux/_G_config.h</samp> is installed and has the same contents as that file.  You had better modify <samp>/usr/include/_G_config.h</samp>.<br>
</ol>
<p>In addition to the options specified in Makefile or specs file, you must specify the -lang-asm (-xassembler-with-cpp) option to process *.S files containing multi-line string literals or assembler comments before you can invoke <b>mcpp</b>.  Usually, you can leave this option specified when preprocessing other files.</p>
<p>To switch between GCC/cpp and <b>mcpp</b>, or to change the default options, perform the following steps:</p>
<ol>
<li>Become a super-user to move to the directory where cpp resides (here assuming <samp>/usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66</samp>).  Let me assume that this directory has GCC/cpp installed under the name of cpp and <b>mcpp</b> as mcpp.<br>
<br>
<li>Create a file called mcpp.sh with the following contents. *2<br>
<pre>
#!/bin/sh
/usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66/mcpp -Q -lang-asm "$@"
</pre>
The -Q option is optional; however, I recommend using it so that the large amount of diagnostic messages is recorded in a file.<br>
<br>
<li>Enter the following commands:<br>
<pre>
chmod a+x mcpp.sh
mv cpp cpp_gnuc
ln -sf mcpp.sh cpp
</pre>
These commands execute mcpp.sh linked to cpp when gcc calls cpp, and mcpp.sh calls <b>mcpp</b> using the above options before the ones specified by gcc.<br>
<br>
<li>To change default options, modify mcpp.sh or call mcpp directly.  To use GCC/cpp do:<br>
<pre>
ln -sf cpp_gnuc cpp
</pre>
</ol>
<p>Note:</p>
<p>*1 <b>mcpp</b> V.2.7 implemented these specs.
Hence, editing of the sources is not necessarily required.</p>
<p>*2 If you use 'configure' and 'make' to compile GCC-specific-build of <b>mcpp</b>, the 'make install' command will set the script appropriately.  The only thing left for you here is to add '-Q -lang-asm' options to the script.</p>

<h4><a name="3.9.5.1" href="#toc.3.9.5.1">3.9.5.1. To Sort <b>mcpp</b>'s Warnings</a></h4>
<p>Another problem with using <b>mcpp</b> is that it issues a huge amount of warning messages.  You can redirect them to a file using the -Q option, but when you preprocess a large amount of source code, such as glibc, the 'mcpp.err' files total several hundred MB or more, so it is impossible to look through all of them.</p>
<p>Taking a close look at mcpp.err, you will find the same warnings being issued repeatedly.  This is because the same *.h files are #included by many source programs.  To make the files more readable, perform the following procedure:</p>
<ol>
<li>To find error messages, enter the following command:<br>
<pre>
grep 'fatal:' `find . -name mcpp.err`
grep 'error:' `find . -name mcpp.err`
</pre>
<li>To sort warning messages, enter the following command:<br>
<pre>
grep 'warning:' `find . -name mcpp.err` | sort -k3 -u > mcpp-warnings-sorted
</pre>
<li>To find all the source lines causing a 'warning:', enter the following command:<br>
<pre>
grep 'warning:' `find . -name mcpp.err` | sort -k3 | uniq > mcpp-warnings-all
</pre>
<li>To find a particular type of 'warning:'s, enter the following command, for example:<br>
<pre>
grep 'warning: Replacement' `find . -name mcpp.err` | sort -k3 | uniq | less
</pre>
After you get an overall idea of what source lines are causing what kinds of errors or warnings, you can see a particular mcpp.err by "less" and then, if necessary, see the source file in question.<br>
<br>
In addition, you can sandwich the source code in question between '#pragma MCPP debug expand' and '#pragma MCPP end_debug' and preprocess it again to see the output.  In that case I recommend invoking <b>mcpp</b> in the following manner so that preprocessing results and diagnostic messages are output to the same file:<br>
<pre>
mcpp &lt;-opts&gt; in-file.c &gt; in-file.i 2&gt;&amp;1
</pre>
When you use "make", you must temporarily change the above shell-script.<br>
</ol>

<h3><a name="3.9.6" href="#toc.3.9.6">3.9.6. Preprocessing GCC 3.2 Source</a></h3>
<p>I first compiled GCC 3.2 sources on Linux and FreeBSD, then I used the generated gcc to compile <b>mcpp</b> and then I recompiled GCC 3.2 sources using <b>mcpp</b> for preprocessing.</p>
<p>New GCC compilers are bootstrapped during various phases of make; gcc and cc1, etc generated in an earlier phase are used to recompile themselves, and those generated compiler drivers and compiler-propers are used again to recompile themselves, and so on.  During the bootstrap, gcc exists under the name of xgcc.</p>
<p>Other than cc1 and cc1plus, GCC 2 has a separate preprocessor called cpp.  In GCC 3, cpp was absorbed into cc1 and cc1plus.  However, there still exists a separate preprocessor cpp0.  To have cpp0 preprocess, the -no-integrated-cpp option must be specified when you invoke gcc or g++.  Therefore, to have <b>mcpp</b> preprocess, you must use a shell-script that has gcc (xgcc) or g++ invoke <b>mcpp</b> first and then invoke cc1 or cc1plus. *1</p>
<p>In the GCC compiler system, the settings of system headers and their search order are becoming very complex.  So, a small difference in settings may result in a difference in preprocessing results.  Even successful compilation was often difficult to attain.  In addition, compilation and tests require a lot of other software.  Older versions of such software may cause failure in compilation or tests.  Actually, compilation sometimes failed due to some hardware problems on my machine.</p>
<p>Actually, I failed to compile GCC 3.2 source under FreeBSD 4.4R.  I had to upgrade FreeBSD to 4.7R and changed software packages to those for FreeBSD 4.7R before I was able to succeed in compilation. *2</p>
<p>I used VineLinux 2.5 on two PCs.  Although compilation of GCC 3.2 sources using GCC 2.95.3 was successful on one PC (K6/200MHz), recompilation of GCC 3.2 sources using the generated GCC 3.2/cc1 failed, and caused many segmentation faults.  Then I changed CPU from K6 to AthlonXP.  This time, recompilation was successful; no segmentation faults occurred.  Hardware may have caused the problem.</p>
<p>When I compiled GCC 3.2 sources using GCC 2.95.4 under FreeBSD on K6, "make -k check" of the generated gcc was almost successful.  When I recompiled GCC 3.2 itself using the generated GCC 3.2, in "make -k check" of g++ and libstdc++-v3, about 20 percent of testsuite was unsuccessful.  However, when using AthlonXP, instead of K6, everything went OK.  Hardware may have caused the problem.</p>
<p>On both VineLinux PCs, when I recompiled GCC 3.2 sources using GCC 3.2 itself and <b>mcpp</b>, "make -k check" of the generated gcc was successful.  However, in "make -k check" of g++ and libstdc++-v3, 20 percent of testsuite failed.  *3, *4, *5</p>
<p>In any case, the cause of this testsuite failure seems to lie not in the generated compilers themselves, such as gcc, g++, cc1 and cc1plus, but in the header files or some other settings.</p>
<p><b>mcpp</b> cannot be described as completely compatible with GCC/cpp, but is highly compatible.  So, <b>mcpp</b> and GCC/cpp can be used interchangeably.</p>
<p>GCC 3.2 sources were compiled in the following environment:</p>
<blockquote>
<table>
  <tr><th>OS           </th><th>make    </th><th>library    </th><th>CPU</th></tr>
  <tr><th>VineLinux 2.5</th><td>GNU make</td><td>glibc 2.2.4</td><td>Celeron/1060MHz</td></tr>
  <tr><th>VineLinux 2.5</th><td>GNU make</td><td>glibc 2.2.4</td><td>K6/200MHz, AthlonXP/2.0GHz</td></tr>
  <tr><th>FreeBSD 4.7R </th><td>UCB make</td><td>libc.so.4  </td><td>K6/200MHz, AthlonXP/2.0GHz</td></tr>
</table>
</blockquote>
<p>Only C and C++ were compiled.</p>
<p>Note:</p>
<p>*1 I had to do this for each bootstrap stage.  Since makefile is too large and too complex to change, I employed an inelegant method; I kept on sitting in front of PC screen during the entire process of bootstrap.  At each end of the stages, I entered ^C and replaced xgcc and others with shell-scripts.</p>
<p>*2 Due to dependency between packages, the system falls into confusion unless appropriate versions are installed.  Actually, for this reason, my FreeBSD temporarily failed to invoke kterm.</p>
<p>*3 "make -k check" cannot be used with <b>mcpp</b> because diagnostics of <b>mcpp</b> are different from those of GCC.</p>
<p>*4 "make -k check" seems to require an English environment, so the LANG environment variable should be set to C.</p>
<p>*5 All the testsuite failures were caused by inability of the pthread_* functions, such as pthread_getspecific and pthread_setspecific, to be linked in the library <samp>i686-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.5.0.0</samp>.  When a correctly generated library was installed, "make -k check" was successful.  On FreeBSD, this problem never happened.  This is probably because of small differences in settings.</p>

<h4>3.9.6.1. Multi-Line String Literal</h4>
<p>This very old way of coding is no longer found in GCC 3.2 sources.  Multi-line string literals were made obsolete only as late as GCC 3.2.  GCC 3.2 processes a source with a multi-line string literal as you expect, but issues a warning.</p>

<h4>3.9.6.2. #include_next and #warning</h4>
<p>limits.h and syslimits.h in <samp>build/gcc/include</samp> generated during the course of make have #include_next.  When GCC 3.2 is installed, these header files are copied to limits.h and syslimits.h in <samp>lib/gcc-lib/i686-pc-linux-gnu/3.2/include</samp>.</p>
<p>GCC 3.2 sources do not have #warning.</p>

<h4><a name="3.9.6.3">3.9.6.3. Variable Argument Macros</a></h4>
<p>GCC 3.2 sources have some variable argument macros, but most of them are found in testsuite and they are nothing but test samples.  Although GCC 3.2 still supports variable argument macros in GCC 2 notation, the ones using <samp>__VA_ARGS__</samp> (in C99 notation) are more frequently found in GCC 3.2 sources.</p>
<p>In GCC 3, variable argument macros in a mixed notation of GCC 2 and C99 are found: *1</p>
<pre>
#define eprintf( fmt, ...)   fprintf( stderr, fmt, ##__VA_ARGS__)
</pre>
<p>This definition corresponds to the following one of GCC 2 spec.</p>
<pre>
#define eprintf( fmt, args...)   fprintf( stderr, fmt, ##args)
</pre>
<p>According to the GCC specifications, in the absence of an argument corresponding to "<samp>...</samp>", the comma immediately before "<samp>##</samp>" is deleted.  So, this is expanded as follows:</p>
<pre>
eprintf( "success!\n")  ==&gt;  fprintf( stderr, "success!\n")
</pre>
<p>As far as this example is concerned, this specification seems to be convenient, but is not desirable in that (1) a comma in a replacement list of a macro definition is not always used to delimit parameters, (2) it allows a token concatenation operator (##) to have other functionality, (3) it makes rules more complex by allowing exceptions. *2, *3, *4</p>

<p>Note:</p>
<p>*1 This manual refers to the variadic macro specification dating from GCC 2 as the GCC2 spec, and to the one introduced in GCC 3 as the GCC3 spec.</p>
<p>*2 While on GCC 2.* <samp>'args...'</samp> in definition of GCC2 spec variadic macro should not be separated as <samp>'args ...'</samp>, on GCC 3 the intervening spaces are allowed.</p>
<p>*3 When -ansi option (or any of -std=c* or -std=iso* option) is specified, GCC, however, does not remove the comma even if the variable argument is absent.
Nevertheless, the '##' disappears silently.</p>
<p>*4 <b>mcpp</b> V.2.6.3 implemented variadic macro of GCC3 spec for <i>STD</i> mode on GCC-specific-build only.
V.2.7 even implemented GCC2 spec one.</p>
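<p>For reference, when the format string itself is folded into the variable argument, the same kind of macro can be written in plain C99, with no comma to delete.  The following is a minimal sketch; the names <tt>eprintf_c99</tt> and <tt>report</tt> are introduced only for this illustration and do not appear in GCC sources.</p>
<pre>
#include &lt;stdio.h&gt;

/* C99 form: the format string is part of __VA_ARGS__, so at least one
 * argument is always present and no comma has to be removed. */
#define eprintf_c99(...)    fprintf(stderr, __VA_ARGS__)

/* Hypothetical helper: calls the macro with and without extra arguments
 * and returns the total number of characters written. */
int report(int n)
{
    int a = eprintf_c99("success!\n");
    int b = eprintf_c99("%d files\n", n);

    return a + b;
}
</pre>
<p>This form cannot name the format parameter separately, which is the price of staying within the C99 rules.</p>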

<h4>3.9.6.4. Empty Arguments in Macro Invocation</h4>
<p>Apart from #include-ed system headers, such as <samp>/usr/include/bits/mathcalls.h</samp> and <samp>/usr/include/bits/sigset.h</samp>, empty arguments in a macro invocation are found only in gcc/libgcc2.h of GCC 3.2 sources themselves. *1</p>
<p>Note:</p>
<p>*1 These two header files are copied into the system header directory when glibc is installed.  They do not exist on FreeBSD because glibc is not used.</p>

<h4>3.9.6.5. Object-Like Macros Replaced with Function-Like Macros</h4>
<p><samp>gcc/fixinc/gnu-regex.c</samp> and <samp>libiberty/regex.c</samp> have object-like macros that are replaced with function-like macro names.  <samp>/usr/lib/bison.simple</samp>, a #included file, also has such macros.  These macros are all relevant to alloca.  For example, <samp>libiberty/regex.c</samp> has the following macro definitions.</p>
<pre>
#define REGEX_ALLOCATE  alloca
#define alloca( size)   __builtin_alloca( size)
</pre>
<p>This should be written as follows:</p>
<pre>
#define REGEX_ALLOCATE( size)   alloca( size)
</pre>
<p>Why did they omit (size)?</p>
<p>In addition, regex.c also has another alloca, which is defined as follows:</p>
<pre>
#define alloca  __builtin_alloca
</pre>
<p>Their writing style is inconsistent.</p>
<p>Furthermore, regex.c has a #include "regex.c" line, which is including itself.  regex.c is a strange and unnecessarily complicated source.</p>
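<p>The difference between the two styles can be sketched as follows.  This is only an illustration, not code from regex.c: <tt>my_alloc</tt>, <tt>ALLOC_OBJ</tt>, <tt>ALLOC_FN</tt> and <tt>both_styles_work</tt> are hypothetical names, and malloc stands in for alloca so that the sketch does not depend on a compiler built-in.</p>
<pre>
#include &lt;stdlib.h&gt;

#define my_alloc(size)      malloc(size)    /* stands in for alloca     */
#define ALLOC_OBJ           my_alloc        /* style found in regex.c   */
#define ALLOC_FN(size)      my_alloc(size)  /* style recommended above  */

/* Hypothetical helper: returns 1 when both styles yield a valid block. */
int both_styles_work(void)
{
    /* ALLOC_OBJ expands to the bare name my_alloc; only the rescan,
     * which picks up the following "(16)" from the source, turns it
     * into a macro call. */
    void *p = ALLOC_OBJ(16);
    /* ALLOC_FN takes its argument itself, so the call is complete
     * within its own replacement list. */
    void *q = ALLOC_FN(16);
    int ok = (p != NULL) &amp;&amp; (q != NULL);

    free(p);
    free(q);
    return ok;
}
</pre>
<p>Both lines end up calling malloc(16), but the first one depends on tokens that follow the macro name in the source, which is what makes this style harder to read.</p>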

<h4>3.9.6.6. Macros Expanded to 'defined'</h4>
<p>GCC 3.2 sources do not have macros expanded to 'defined'.  According to GCC 3.2 documents, this type of macro is preprocessed in the same way as in GCC 2/cpp, but GCC 3.2 issues a warning saying that it "may not be portable".  However, GCC 3.2 does not issue a warning for the example shown in 3.9.4.6.</p>

<h4>3.9.6.7. Preprocessing of .S Files</h4>
<p>cpp.info of GCC 3 says:</p>
<blockquote>Wherever possible, you should use a preprocessor geared to the
language you are writing in.  Modern versions of the GNU assembler have
macro facilities.</blockquote>
<p>However, the <samp>gcc/config</samp> directory has several *.S files.</p>

<h4>3.9.6.8. rpcgen and -dM Option</h4>
<p>Make of GCC 3.2 uses neither rpcgen nor -dM option.  However, specifications of rpcgen and the -dM option do not seem to change from the previous versions.</p>

<h4>3.9.6.9. -include, -isystem and -I- Options</h4>
<p>These options are frequently used in make of GCC 3.2.  Sometimes, the -isystem option is used to specify several system include directories at one time.  Is it inevitable to use the option during software compilation that updates system headers themselves?  I think they had better use an environment variable to specify all the system include directories.</p>
<p>On the other hand, GCC 3/cpp documents discourage the use of the -iwithprefix and -iwithprefixbefore options.  GCC provides many options to specify include directories.  Is GCC 3.2 moving toward reorganizing them or reducing their number? *1</p>
<p>Note:</p>
<p>*1  GCC 3.2 Makefile uses the -iprefix option in a stand alone manner (without using -iwithprefix or -iwithprefixbefore), although the -iprefix option makes sense only when used with one of these two options following it.</p>

<h4>3.9.6.10. Undocumented Predefined Macros</h4>
<p>GCC 2 did not document predefined macros, such as <tt>__VERSION__</tt>, <tt>__SIZE_TYPE__</tt>, <tt>__PTRDIFF_TYPE__</tt> and <tt>__WCHAR_TYPE__</tt>.  Even with the -dM option, their existence was unknown.  GCC 3 not only documents them but also enhances -dM to show their definitions.</p>

<h4>3.9.6.11. Undocumented Environment Variables</h4>
<p>GCC 3 documents the <tt>SUNPRO_DEPENDENCIES</tt> environment variable, which GCC 2 did not.  (I do not know why this environment variable is needed.)</p>

<h4>3.9.6.12. Other Problems</h4>
<p>GCC 3 implements the following #pragmas:</p>
<pre>
#pragma GCC poison
#pragma GCC dependency
#pragma GCC system_header
</pre>
<p>Of these, GCC 3.2 sources use poison and system_header.  <b>mcpp</b> does not support these #pragmas because I do not think them necessary. (I omit explanation of their specifications.) *1</p>
<p>GCC 3 deprecates assertion directives, such as #assert, although gcc, by default, specifies the -A option.</p>
<p>In GCC 2, the -traditional option is implemented in one and the same cpp, resulting in a strange mixture of very old specifications and C99 ones.  In GCC 3, its preprocessor was divided into two: non-traditional cpp0 and tradcpp0.  The -traditional option is valid only for gcc.  cpp0 does not provide it.  gcc -traditional invokes tradcpp0 for preprocessing.</p>
<p>tradcpp0 is getting closer to a true traditional preprocessor before C90. They say that they no longer maintain tradcpp0 except for serious bugs.</p>
<p>The strange specifications of GCC 2/cpp seem to have been significantly revised.</p>
<p>Note:</p>
<p>*1 <b>mcpp</b> V.2.7 onward supports <samp>#pragma GCC system_header</samp> on GCC-specific-build.</p>

<h3><a name="3.9.7" href="#toc.3.9.7">3.9.7. To Use <b>mcpp</b> with GCC 3 or 4</a></h3>
<p>As seen above, as far as preprocessing is concerned, GCC 3.2 sources are much improved over glibc 2.1.3 sources in that the traditional way of writing has been almost eliminated and that meaningless options are no longer used.</p>
<p>GCC 3.2/cpp0 itself is also much superior to GCC 2/cpp in that it regards traditional specifications as obsolete and articulates the token-based principle.  Undocumented specifications have been significantly reduced.  Although these improvements are still not sufficient, GCC is certainly moving in the right direction.</p>
<p>However, GNU / Linux system headers have become so complex that it is difficult to grasp their entire structure, which may be one of the biggest causes of problems in the GNU / Linux system.</p>
<p>Another pitiful fact is that the preprocessor is absorbed into the compiler-proper.  Therefore, to use <b>mcpp</b>, the -no-integrated-cpp option must be specified when invoking gcc or g++.  If you compile a large amount of source files with complicated or many makefiles, or if some program automatically invokes gcc, you should create a shell-script that invokes gcc or g++ with the -no-integrated-cpp option automatically specified.</p>
<p>Let me take an example of this.  Place the following shell-scripts in the directory where gcc and g++ reside (on my Linux, <samp>/usr/local/gcc-3.2/bin</samp>), under the names of gcc.sh and g++.sh, respectively.</p>
<pre>
#!/bin/sh
/usr/local/gcc-3.2/bin/gcc_proper -no-integrated-cpp "$@"

#!/bin/sh
/usr/local/gcc-3.2/bin/g++_proper -no-integrated-cpp "$@"
</pre>
<p>Move to this directory and enter the following commands:</p>
<pre>
chmod a+x gcc.sh g++.sh
mv gcc gcc_proper
mv g++ g++_proper
ln -sf gcc.sh gcc
ln -sf g++.sh g++
</pre>
<p>In the directory where cpp is located (on my Linux, <samp>/usr/local/gcc-3.2/lib/gcc-lib/i686-pc-linux-gnu/3.2</samp>), create a script that executes <b>mcpp</b> when cpp0 is invoked, as you did for GCC 2 (see <a href="#3.9.5">3.9.5</a>).  By doing this, gcc or g++ first invokes <b>mcpp</b> and then invokes cc1 or cc1plus with the -fpreprocessed option appended.  -fpreprocessed indicates the source has been preprocessed already. *1</p>
<p>Note that when a GCC version other than the system standard one is installed, additional include directory settings may be required.  <b>mcpp</b> embeds these settings when <b>mcpp</b> itself is compiled, thus eliminating the need to set them with environment variables.</p>
<p>If possible, I would like to replace the cpplib source, the preprocessing part of cc1 or cc1plus, with <b>mcpp</b>.  However, the source files that define the internal interface between cpplib and cc1 or cc1plus, as well as the external interface between cpplib and user programs that use it, amount to as much as 46KB, so replacement is impossible.  Why are the interfaces so complex?  It is a pity.</p>
<p>Note:</p>
<p>*1 <b>mcpp</b> gets all the necessary information by 'configure' and sets these scripts by 'make install'.</p>

<h4>3.9.7.1. To Use <b>mcpp</b> with GCC 3.3 and 3.4-4.1</h4>
<p>Although GCC 3.2 seemed to go in the direction of better portability, GCC turned its direction to a different goal on 3.3 and 3.4.  V.3.3 and 3.4 differ from 3.2 in the following points.</p>
<ol>
<li>The independent preprocessor cpp0 was abolished.  The execution option '-no-integrated-cpp' changed its meaning: gcc invokes cc1 (cc1plus) instead of cpp0 as a preprocessor even if this option is specified, and gcc passes to the preprocessor some options which are irrelevant to preprocessing. (What a dirty implementation!)<br>
<li>Many (several dozen) macros are predefined.  The relationship between the system headers and GCC became more complicated.<br>
<li>Tradcpp was also abolished and absorbed into an execution option of cc1.  Some old specifications, which were obsoleted or deprecated in V.3.2, were restored.<br>
</ol>
<p>GCC / cc1 is becoming one huge and complex compiler that absorbs the preprocessor and some of the system headers' contents.  I doubt whether this is a better way of constructing a compiler, especially an open source one.</p>
<p>As regards <b>mcpp</b>, it is a nuisance that gcc arbitrarily hands to preprocessor some irrelevant options.  Since it is risky to ignore all the options unrecognized by <b>mcpp</b>, I didn't adopt this.  Although <b>mcpp</b> ignores the wrong options such as -c or -m* which are frequently handed from gcc, it will get an error if other unexpected options are passed on.</p>
<p>In order to avoid conflicts with those wrong options, <b>mcpp</b> V.2.5 changed some options, -c to -@compat, -m to -e, and some others.</p>
<p>To use <b>mcpp</b> with GCC 3.2 or earlier, it is only necessary to replace the invocation of cpp0 with <b>mcpp</b>.  To use <b>mcpp</b> with GCC 3.3 or later, it is necessary to split the invocation of cc1 into <b>mcpp</b> and cc1.  src/set_mcpp.sh will write shell-scripts for this purpose in the GCC libexec directory on <b>mcpp</b> installation.  The 'make install' command will also get GCC predefined macros using the -dM option and set those for <b>mcpp</b>. *1, *2, *3</p>
<p>In addition, GCC 3.4 changed processing of multi-byte characters.  Its documentation says: *4</p>
<ol>
<li>It converts every encoding of multi-byte characters to UTF-8 at the first phase of preprocessing.<br>
<li>It uses libiconv functions for the conversion, therefore it can handle all the encodings iconv can do.<br>
<li>It has '-finput-charset=&lt;encoding&gt;' option to specify the source file's encoding.  (In other words, the encoding is not converted without this option.)<br>
<li>It has also '-fexec-charset=&lt;encoding&gt;' option to specify the output encoding which defaults to UTF-8. *5<br>
</ol>
<p>There is a trend to identify "internationalization" with "unicodization", especially among Western people who do not use multi-byte characters.  It seems that this trend has reached GCC.</p>
<p>What is worse, GCC 3.4 or later does not implement this specification sufficiently.  In practice, it behaves as follows:</p>
<ol>
<li>As for EUC-JP, GB2312, KSC-5601 and Big5, it converts these encodings to UTF-8 correctly with -finput-charset option, and it passes them as they are without this option. *6<br>
<li>The -fexec-charset option has no effect on V.3.4 or V.4.0.  On V.4.1-4.3, the option has effect, and works correctly.<br>
<li>With ISO2022-JP, GCC cannot preprocess on V.3.4 or 4.0, whereas on V.4.1-4.3 it can preprocess the encoding.<br>
<li>As for shift-JIS, all the versions get confused in preprocessing when -finput-charset is specified.<br>
</ol>
<p><b>mcpp</b> takes the -e &lt;encoding&gt; option to specify an encoding, and the GCC-specific-build escapes with a &lt;backslash&gt; any byte in a multi-byte character which has the same value as &lt;backslash&gt;, '"' or '\'', when the encoding is one of BIG-5, shift-JIS or ISO2022-JP, in order to complement GCC's inability.  However, it does not convert the encoding to UTF-8.  <b>mcpp</b> also treats -finput-charset as the same option as -e.  I adopted these specifications because: *7</p>
<ol>
<li>As for shift-JIS, GCC of any version cannot process it if -f*-charset options are specified.
But, as for shift-JIS, ISO2022-JP and Big5, if the encodings are not converted but supplemented with &lt;backslash&gt;es, the multi-byte characters are output as they are by cc1 on any version of GCC.
Also EUC-JP, GB2312, KSC-5601 and UTF-8 are passed as they are, if -f*-charset are not specified.
That is to say, the multi-byte characters are treated as single-byte character sequences.<br>
<li>GCC up to V.4.0 / cc1 does not convert back from UTF-8 to the original encodings.<br>
<li>I hope that GCC will change the multi-byte character handling in the near future.<br>
</ol>
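<p>The backslash supplement described above can be sketched for the shift-JIS case as follows.  This is not <b>mcpp</b>'s actual code but a minimal illustration under stated assumptions: the function names are hypothetical, only two-byte shift-JIS characters are handled, the inserted &lt;backslash&gt; is assumed to go immediately before the second byte, and only the value 0x5c is checked because the second byte of a shift-JIS character cannot have the value of '"' or '\''.</p>
<pre>
#include &lt;stddef.h&gt;

/* Returns nonzero if c is the first byte of a two-byte shift-JIS character. */
static int sjis_lead(unsigned char c)
{
    return (0x81 &lt;= c &amp;&amp; c &lt;= 0x9f) || (0xe0 &lt;= c &amp;&amp; c &lt;= 0xfc);
}

/* Hypothetical helper: copies len bytes from src to dst, inserting a
 * backslash before any second byte whose value is 0x5c.
 * dst must have room for 2 * len bytes.  Returns the output length. */
size_t sjis_escape(const unsigned char *src, size_t len, unsigned char *dst)
{
    size_t i = 0, o = 0;

    while (i &lt; len) {
        if (sjis_lead(src[i]) &amp;&amp; i + 1 &lt; len) {
            dst[o++] = src[i++];        /* first byte               */
            if (src[i] == 0x5c)
                dst[o++] = 0x5c;        /* inserted backslash       */
            dst[o++] = src[i++];        /* second byte              */
        } else {
            dst[o++] = src[i++];        /* single-byte character    */
        }
    }
    return o;
}
</pre>
<p>For example, the two bytes 0x95 0x5c become 0x95 0x5c 0x5c, which a compiler that treats the text as single-byte characters reads back as the original two bytes inside a string literal.  An ordinary single-byte backslash is left untouched.</p>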
<p>Note:</p>
<p>*1 The output of the -dM option, however, differs slightly depending on other options.
What is worse, most of the predefined macros are undocumented ones.
As a result, the whole picture cannot be grasped easily.</p>
<p>*2 MinGW does not support symbolic links.  Though the 'ln -s' command exists, it does not link but only copies.  Moreover, MinGW's GCC refuses to invoke a shell-script even if it is named cc1.  To cope with this, <b>mcpp</b>'s MinGW GCC-specific-build generates a binary executable named cc1.exe (copied also to cc1plus.exe) which invokes mcpp.exe or GCC's cc1.exe/cc1plus.exe.</p>
<p>*3 CygWIN / GCC has -mno-cygwin option which alters system include directory and alters GCC's predefined macros.  <b>mcpp</b> V.2.6.1 onward, CygWIN GCC-specific-build supports this option and generates two sets of header files for the predefined macros.</p>
<p>*4 On GCC in my FreeBSD 6.3, multi-byte character conversion to UTF-8 does not work at all, though libiconv seems to be linked to them.
It was the same with FreeBSD 5.3 and 6.2, too.</p>
<p>*5 This conversion seems not to be done in preprocessing phase, but in compilation phase.
Output of -E option is still UTF-8.</p>
<p>*6 GCC V.4.1-4.3 fail to compile due to a bug of GCC, if -save-temps or -no-integrated-cpp option is specified at the same time with -f*-charset option.</p>
<p>*7 When you pass the output of <b>mcpp</b> to cc1, you should not specify -fexec-charset option nor -finput-charset option.</p>

<h3><a name="3.9.8" href="#toc.3.9.8">3.9.8. Preprocessing Linux/glibc 2.4</a></h3>

<p>I compiled glibc 2.4 (March, 2006) source, and checked preprocessing problems in it.
As a compiler system, I used GCC 4.1.1 with <b>mcpp</b> 2.6.3.
Since my machine is x86 type, I did not check the codes for other CPUs.</p>
<p>This version is six years newer than glibc 2.1.3 (February, 2000), which I checked formerly, so naturally some parts have changed largely from the old version.
However, it has remarkably many parts unchanged.
On the whole, most of the problems I noticed in the old version have not been revised; on the contrary, unportable sources have increased.</p>

<h4>3.9.8.1. Multi-Line String Literal</h4>

<p>The old-fashioned "multi-line string literal" has disappeared.</p>

<h4>3.9.8.2. #include_next, #warning</h4>

<p>#include_next is found in the following source files.
Its occurrence has increased as compared with the six years older version.</p>

<p><samp>
catgets/config.h,
elf/tls-macros.h,
include/bits/dlfcn.h,
include/bits/ipc.h,
include/fpu_control.h,
include/limits.h,
include/net/if.h,
include/pthread.h,
include/sys/sysctl.h,
include/sys/sysinfo.h,
include/tls.h,
locale/programs/config.h,
nptl/sysdeps/pthread/aio_misc.h,
nptl/sysdeps/unix/sysv/linux/aio_misc.h,
nptl/sysdeps/unix/sysv/linux/i386/clone.S,
nptl/sysdeps/unix/sysv/linux/i386/vfork.S,
nptl/sysdeps/unix/sysv/linux/sleep.c,
sysdeps/unix/sysv/linux/ldsodefs.h,
sysdeps/unix/sysv/linux/siglist.h
</samp></p>

<p>Though the following is not a part of glibc itself but a testcase file to test glibc by 'make check', #include_next is found also in it.</p>

<p><samp>sysdeps/i386/i686/tst-stack-align.h</samp></p>

<p>#warning appears in <samp>sysvipc/sys/ipc.h</samp>.
This directive is in a block to be skipped in normal processing, and does not cause any problem.</p>

<h4>3.9.8.3. Variable Argument Macros</h4>

<p>There are definitions of variable argument macros in the following files.
All of these follow the old spec dating from GCC 2.
None of them uses the C99 spec, or even the GCC3 spec.</p>

<p><samp>elf/dl-lookup.c,
elf/dl-version.c,
include/libc-symbols.h,
include/stdio.h,
locale/loadlocale.c,
locale/programs/ld-time.c,
locale/programs/linereader.h,
locale/programs/locale.c,
locale/programs/locfile.h,
nptl/sysdeps/pthread/setxid.h,
nss/nss_files/files-XXX.c,
nss/nss_files/files-hosts.c,
sysdeps/generic/ldsodefs.h,
sysdeps/i386/fpu/bits/mathinline.h,
sysdeps/unix/sysdep.h,
sysdeps/unix/sysv/linux/i386/sysdep.h</samp></p>

<p>The following testcase files also have variadic macro definitions of GCC2 spec.</p>

<p><samp>localedata/tst-ctype.c,
posix/bug-glob2.c,
posix/tst-gnuglob.c,
stdio-common/bug13.c</samp></p>

<p>Moreover, many of the calls of these macros lack an actual argument for the variable argument.
As many as 142 files have such macro calls, and in 120 of them the replacement list of the called macro has a ", ##" sequence immediately preceding the variable argument, so that the ',' is removed when the variable argument is absent.</p>
<p>As a variable argument macro specification, the C99 one is portable and recommendable.
However, it is not so easy to rewrite a GCC spec macro to a C99 one.
Neither GCC2 spec nor GCC3 spec variadic macros necessarily correspond one-to-one to C99 spec ones, because the GCC specs remove the preceding comma when the variable argument is absent.
If you rewrite a GCC spec macro definition to a C99 one, you also need to rewrite the macro calls that lack the variable argument and supply an argument.</p>
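<p>The following is a minimal sketch of such a rewrite, not code taken from glibc.
The macro name <samp>REPORT</samp>, the buffer <samp>line</samp> and the helper functions are hypothetical.
It shows a C99 definition, and a call which had no variable argument in the GCC spec form and therefore has to supply one (here an empty string consumed by an added <samp>%s</samp>).</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* GCC2 spec original, shown only as a comment because it is not Standard C:
 *     #define REPORT(fmt, args...)    snprintf(line, sizeof line, fmt, ##args)
 *     REPORT("done");         ", ##args" removes the ',' when args is absent
 */

static char line[64];                   /* hypothetical output buffer */

/* C99 form: "..." must correspond to at least one actual argument. */
#define REPORT(fmt, ...)    snprintf(line, sizeof line, fmt, __VA_ARGS__)

/* A call that had no variable argument must now supply one. */
static const char *report_without_extra(void)
{
    REPORT("done%s", "");               /* supplied dummy argument */
    return line;
}

/* A call that already had a variable argument is unchanged. */
static const char *report_with_extra(int files)
{
    REPORT("%d files", files);
    return line;
}

/* The assertions document the expected results. */
static void check_report(void)
{
    assert(strcmp(report_without_extra(), "done") == 0);
    assert(strcmp(report_with_extra(142), "142 files") == 0);
}
```

<p>Every call site without a variable argument needs this kind of change, which is why the rewrite is laborious when such calls are numerous.</p>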
<p>In glibc 2.1.3, GCC2 spec macros were not so many, and it was not heavy work for a user to rewrite them with an editor.
In glibc 2.4, however, such macro definitions have increased, and their calls in particular have increased vastly.
As a consequence, it is now practically impossible for a user to rewrite them.</p>
<p>To cope with this situation, <b>mcpp</b> V.2.6.3 onward implements GCC3 spec variadic macros for GCC-specific-build only.
Furthermore, <b>mcpp</b> V.2.7 implements GCC2 spec ones, too.
However, you should not write GCC2 spec macros in your sources, because the spec deviates too far from the token-based principle.
Since GCC2 spec corresponds to GCC3 spec one-to-one, it is easy to rewrite a macro definition to GCC3 spec, and the calls of that macro need not be rewritten.
Macros already written in GCC2 spec will become a little clearer if rewritten this way. *1</p>
<p>To rewrite a GCC2 spec variadic macro to GCC3 spec one, for example, change:</p>

<pre>
#define libc_hidden_proto(name, attrs...)   hidden_proto (name, ##attrs)
</pre>

<p>to:</p>

<pre>
#define libc_hidden_proto(name, ...)    hidden_proto (name, ## __VA_ARGS__)
</pre>

<p>That is, change the parameter <samp>attrs...</samp> to <samp>...</samp>, and change <samp>attrs</samp> in the replacement-list to <samp>__VA_ARGS__</samp>.</p>

<p>Note:</p>
<p>*1 As for variadic macro of GCC2 spec and GCC3 spec, see <a href="#3.9.1.6">3.9.1.6</a>, <a href="#3.9.6.3">3.9.6.3</a> respectively.</p>

<h4>3.9.8.4. Empty Argument During Macro Calls</h4>

<p>Macro calls with an empty argument are found in as many as 488 source files.
They have greatly increased since the old version.
The approval of empty macro arguments by C99 may have encouraged the tendency.</p>

<p>In particular, <samp>math/bits/mathcalls.h</samp> has as many as 79 macro calls with an empty argument.
That is the same as in the old version.</p>
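<p>As an illustration of what such a call looks like, here is a minimal sketch.
The macro <samp>DEFINE_TWICE</samp> and the generated functions are hypothetical and are not taken from glibc.
An empty argument is valid in C99, where pasting it with <samp>##</samp> leaves the other operand unchanged; in C90 the result is undefined.</p>

```c
#include <assert.h>

/* Hypothetical macro: "suffix" is pasted onto the function name and may be
 * left empty by the caller. */
#define DEFINE_TWICE(type, suffix) \
    static type twice##suffix(type x) { return x + x; }

DEFINE_TWICE(double, )      /* empty second argument: defines twice()   */
DEFINE_TWICE(float, f)      /* non-empty argument:    defines twicef()  */

/* The assertions document the expected results. */
static void check_twice(void)
{
    assert(twice(1.5) == 3.0);
    assert(twicef(2.0f) == 4.0f);
}
```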

<h4>3.9.8.5. Object-Like Macros Replaced with Function-like Macro Name</h4>

<p>The following files define object-like macros that are replaced with function-like macro names:</p>

<p><samp>argp/argp-fmtstream.h,
hesiod/nss_hesiod/hesiod-proto.c,
intl/plural.c,
libio/iopopen.c,
nis/nss_nis/nis-hosts.c,
nss/nss_files/files-hosts.c,
nss/nss_files/files-network.c,
nss/nss_files/files-proto.c,
nss/nss_files/files-rpc.c,
nss/nss_files/files-service.c,
resolv/arpa/nameser_compat.h,
stdlib/gmp-impl.h,
string/strcoll_l.c,
sysdeps/unix/sysv/linux/clock_getres.c,
sysdeps/unix/sysv/linux/clock_gettime.c</samp></p>

<p><samp>elf/link.h</samp> defines function-like macros that are replaced with function-like macro names. For example:</p>

<pre>
#define ELFW(type) _ElfW (ELF, __ELF_NATIVE_CLASS, type)
                                        /* sysdeps/generic/ldsodefs.h:46    */
#define _ElfW(e,w,t)    _ElfW_1 (e, w, _##t)            /* elf/link.h:32    */
#define _ElfW_1(e,w,t)  e##w##t                         /* elf/link.h:33    */
#define __ELF_NATIVE_CLASS __WORDSIZE               /* bits/elfclass.h:11   */
#define __WORDSIZE 32           /* sysdeps/wordsize-32/bits/wordsize.h:19   */
#define ELF32_ST_TYPE(val) ((val) & 0xf)                /* elf/elf.h:429    */
</pre>

<p>With the above macro definitions, consider the following line:</p>

<pre>
    && ELFW(ST_TYPE) (sym->st_info) != STT_TLS      /* elf/do-lookup.h:81   */
</pre>

<p>In this macro call, <samp><tt>ELFW</tt>(ST_TYPE)</samp> is expanded in the following steps:</p>

<pre>
    ELFW(ST_TYPE)
    _ElfW(ELF, __ELF_NATIVE_CLASS, ST_TYPE)
    _ElfW_1(ELF, 32, _ST_TYPE)
    ELF32_ST_TYPE
</pre>

<p>Then, <tt>ELF32_ST_TYPE</tt> with the subsequent sequence <samp>(sym->st_info)</samp> is expanded to <samp>((sym->st_info) & 0xf)</samp>.
That is to say, the function-like macro call <samp><tt>_ElfW_1</tt>(ELF, 32, _ST_TYPE)</samp> is expanded to the name of another function-like macro, <tt>ELF32_ST_TYPE</tt>.</p>

<p>These macros become clearer if three of the six definitions above are written as:</p>

<pre>
#define ELFW( type, val)        _ElfW( ELF, __ELF_NATIVE_CLASS, type, val)
#define _ElfW( e, w, t, val)    _ElfW_1( e, w, _##t, val)
#define _ElfW_1( e, w, t, val)  e##w##t( val)
</pre>

<p>and if they are used as:</p>

<pre>
    && ELFW(ST_TYPE, sym->st_info) != STT_TLS
</pre>

<p>Although these arguments may seem a little redundant, they are more natural than the original ones, if we think of function call syntax.</p>
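<p>The two styles can be compared side by side in a self-contained sketch.
The names <samp>WORDSIZE_DEMO</samp>, <samp>ELFW_</samp>, <samp>ELFW2</samp> and the two helper functions are hypothetical stand-ins chosen to avoid the reserved identifiers of the real headers; only <samp>ELF32_ST_TYPE</samp> is copied from the definitions cited above.</p>

```c
#include <assert.h>

#define WORDSIZE_DEMO 32                        /* stands in for __WORDSIZE */
#define ELF32_ST_TYPE(val) ((val) & 0xf)

/* Original style: the call expands to the *name* ELF32_ST_TYPE, which is
 * then applied to the parenthesized tokens that follow it in the source. */
#define ELFW(type)              ELFW_(ELF, WORDSIZE_DEMO, type)
#define ELFW_(e, w, t)          ELFW_1(e, w, _##t)
#define ELFW_1(e, w, t)         e##w##t

/* Rewritten style: the argument is passed through every level, so each
 * macro call is complete by itself. */
#define ELFW2(type, val)        ELFW2_(ELF, WORDSIZE_DEMO, type, val)
#define ELFW2_(e, w, t, val)    ELFW2_1(e, w, _##t, val)
#define ELFW2_1(e, w, t, val)   e##w##t(val)

static int st_type_original(int info)
{
    return ELFW(ST_TYPE) (info);        /* -> ELF32_ST_TYPE (info) */
}

static int st_type_rewritten(int info)
{
    return ELFW2(ST_TYPE, info);        /* -> ELF32_ST_TYPE(info)  */
}

/* Both styles yield the low four bits of the argument. */
static void check_elfw(void)
{
    assert(st_type_original(0x12) == 2);
    assert(st_type_rewritten(0x12) == 2);
}
```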

<h4><a name="3.9.8.6">3.9.8.6. Macros Expanded to 'defined'</a></h4>

<p>The following files contain macro definitions whose replacement-lists have the <samp>'defined'</samp> token. *1</p>

<p><samp>iconv/skeleton.c,
sysdeps/generic/_G_config.h,
sysdeps/gnu/_G_config.h,
sysdeps/i386/dl-machine.h,
sysdeps/i386/i686/memset.S,
sysdeps/mach/hurd/_G_config.h,
sysdeps/posix/sysconf.c</samp></p>

<p>Those macros are used in some #if lines of the following files, and also some of the above files themselves.</p>

<p><samp>elf/dl-conflict.c,
elf/dl-runtime.c,
elf/dynamic-link.h</samp></p>

<p>In glibc 2.1.3, <samp>malloc/malloc.c</samp> had a macro definition of <tt>HAVE_MREMAP</tt> whose replacement-list contained the <samp>'defined'</samp> token.
In glibc 2.4, that macro definition has been revised to a portable one; nevertheless, unportable macros of the same sort have increased in other source files.</p>
<p>In a #if expression, the result of a macro expansion whose replacement-list has the <samp>'defined'</samp> token is undefined according to the Standards; that GCC preprocesses such an expression in a plausible way is merely its own arbitrary choice.
In order to make these sources portable to other preprocessors, at least the definitions of these macros should be rewritten, and in some cases the calls of the macros should be rewritten, too. *2</p>
<p>In most cases, a simple rewrite as seen in <a href="#3.9.4.6">3.9.4.6</a> is sufficient.
In some cases, however, this method does not work.
Those are the cases where the evaluation result of <samp>'defined MACRO'</samp> differs depending on when it is evaluated.
For example, <samp>sysdeps/i386/dl-machine.h</samp> has the following macro definition, which is used in some #if expressions in other files.</p>

<pre>
#define ELF_MACHINE_NO_RELA defined RTLD_BOOTSTRAP
</pre>

<p>Rewriting the definition as follows will not do.</p>

<pre>
#if defined RTLD_BOOTSTRAP
#define ELF_MACHINE_NO_RELA 1
#endif
</pre>

<p>The macro <samp>RTLD_BOOTSTRAP</samp> is defined in <samp>elf/rtld.c</samp>, so it is defined when <samp>dl-machine.h</samp> is read if and only if that file is included before <samp>dl-machine.h</samp>.
In other words, the evaluation result of <samp>'defined RTLD_BOOTSTRAP'</samp> depends on the order of including the two files.
In order to make these sources portable, the macro <tt>ELF_MACHINE_NO_RELA</tt> should be abandoned, since it is a useless macro found only in #if lines, and the #if line:</p>

<pre>
#if ELF_MACHINE_NO_RELA
</pre>

<p>should be rewritten as:</p>

<pre>
#if defined RTLD_BOOTSTRAP
</pre>

<p>In glibc, this portable style of #if line is found in many places; at the same time, the undefined style of the above example is also found in some places.</p>

<p>Note:</p>
<p>*1 On Linux, /usr/include/_G_config.h is the header file installed from glibc's sysdeps/gnu/_G_config.h, therefore it has the same macro definition as:</p>
<pre>
#define _G_HAVE_ST_BLKSIZE defined (_STATBUF_ST_BLKSIZE)
</pre>
<p>This should be rewritten to:</p>
<pre>
#if defined (_STATBUF_ST_BLKSIZE)
#define _G_HAVE_ST_BLKSIZE 1
#endif
</pre>
<p>*2 <b>mcpp</b> V.2.7 and later in <i>STD</i> mode on GCC-specific-build handles the '<samp>defined</samp>' token generated by macro expansion in a #if line as GCC does.
Yet, such bug-to-bug compatible handling should not be depended on.</p>
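<p>The portable style recommended in this section can be sketched as follows.
The names <samp>FEATURE_A</samp>, <samp>FEATURE_B</samp>, <samp>HAVE_A</samp> and <samp>HAVE_B</samp> are hypothetical.
The sketch relies only on the Standard rule that an identifier which is not a macro is evaluated as 0 in a #if expression, so no #else branch is needed for the undefined case.</p>

```c
#include <assert.h>

#define FEATURE_A 1     /* hypothetical; FEATURE_B is deliberately undefined */

/* Unportable (undefined result when HAVE_A is used in #if):
 *     #define HAVE_A  defined (FEATURE_A)
 */

/* Portable: 'defined' is evaluated in a directive, once, at this point. */
#if defined (FEATURE_A)
#define HAVE_A 1
#endif
#if defined (FEATURE_B)
#define HAVE_B 1
#endif

static int have_a(void)
{
#if HAVE_A
    return 1;
#else
    return 0;
#endif
}

static int have_b(void)
{
#if HAVE_B              /* HAVE_B is not a macro here, so this is '#if 0' */
    return 1;
#else
    return 0;
#endif
}

/* The assertions document the expected results. */
static void check_have(void)
{
    assert(have_a() == 1);
    assert(have_b() == 0);
}
```

<p>Note that this style fixes the result at the point of the definition, which is exactly why it is not applicable to the <tt>ELF_MACHINE_NO_RELA</tt> case described above.</p>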

<h4>3.9.8.7. Preprocessing .S File</h4>

<p>*.S files are provided for each CPU type, so their number is very large and amounts to more than 1000.
The files for a single CPU type such as x86 are only a portion of them.</p>
<p>A *.S file is an assembler source with inserted C preprocessing directives such as #if or #include, C comments and C macros.
Since an assembler source does not consist of a C token sequence, preprocessing it with a C preprocessor carries some risks.
To process an assembler source, the preprocessor must pass such characters as % or $ (which are not used in C except in string literals or character constants) as they are, and must preserve the presence or absence of spaces as it is.
Furthermore, the preprocessor must relax syntax checking to pass a sequence which would be an error if it were in C source.
On the other hand, it must process #if lines and macros as in C, and must do some sort of error checking, too.
What a nuisance!
These specifications have no logical basis at all; they are nothing more than GCC's local and mostly undocumented behaviors.</p>

<p>To illustrate the problems, let me take an example of the following fragment from <samp>nptl/sysdeps/unix/sysv/linux/i386/i486/pthread_cond_wait.S</samp>.</p>

<pre>
    .byte   8               # Return address register
                            # column.
#ifdef SHARED
    .uleb128 7              # Augmentation value length.
    .byte   0x9b            # Personality: DW_EH_PE_pcrel
                            # + DW_EH_PE_sdata4
</pre>

<p><samp>'#ifdef SHARED'</samp> is intended to be a directive of C.
On the other hand, the latter part of each line, starting with #, is supposed to be a comment.
<samp>'# column.'</samp> is, however, syntactically indistinguishable from an invalid directive, since the # is the first non-white-space character of the line.
<samp>'# + DW_EH_PE_sdata4'</samp> even causes a syntax error in C.<br>
Another file has the following line, where a single quote appears alone.
In C, a pair of single quotes is used to delimit a character constant, and an unmatched single quote causes a tokenization error.</p>

<pre>
    movl 12(%esp), %eax     # that `fixup' takes its parameters in regs.
</pre>

<p>The above pthread_cond_wait.S also has the following line which is a macro call.</p>

<pre>
versioned_symbol (libpthread, __pthread_cond_wait, pthread_cond_wait,
          GLIBC_2_3_2)
</pre>

<p>The macros are defined as:</p>

<pre>
# define versioned_symbol(lib, local, symbol, version) \
  versioned_symbol_1 (local, symbol, VERSION_##lib##_##version)
                                    /* include/shlib-compat.h:65    */
# define versioned_symbol_1(local, symbol, name) \
  default_symbol_version (local, symbol, name)
                                    /* include/shlib-compat.h:67    */
# define default_symbol_version(real, name, version) \
     _default_symbol_version(real, name, version)
                                    /* include/libc-symbols.h:398   */
#   define _default_symbol_version(real, name, version) \
     .symver real, name##@##@##version
                                    /* include/libc-symbols.h:411   */
#define VERSION_libpthread_GLIBC_2_3_2  GLIBC_2.3.2
                            /* Created by make: abi-versions.h:145  */
</pre>

<p>The line is expected to be expanded as:</p>

<pre>
.symver __pthread_cond_wait, pthread_cond_wait@@GLIBC_2.3.2
</pre>

<p>The problem is the definition of <tt>_default_symbol_version</tt>.
There is no C token containing '@' (except string-literal or character-constant).
Though <samp>pthread_cond_wait@@GLIBC_2.3.2</samp> is a sequence generated by concatenating some parts with the <samp>##</samp> operator, it is not a C token.
The concatenation also generates invalid tokens in the middle of its processing.
The macro uses the <samp>##</samp> operator of C, nevertheless its syntax is far from C.</p>
<p>In order to do a sort of preprocessing on an assembler source, essentially an assembler macro processor should be used.
To process assembler code with C, it is recommended to use asm() or __asm__() whenever possible, embedding the assembler code in a string literal, and to name the file *.c rather than *.S.<br>
<samp>libc-symbols.h</samp> has another version of the above macro, as follows, which is used for *.c.
This macro can be processed by a Standard-conforming C preprocessor without problem.</p>

<pre>
#   define _default_symbol_version(real, name, version) \
     __asm__ (".symver " #real "," #name "@@" #version)
</pre>
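<p>Why this form is unproblematic can be checked without an assembler.
In the following sketch, the hypothetical macro <samp>SYMVER_DIRECTIVE</samp> builds the same string with the <samp>#</samp> operator but leaves out the <samp>__asm__</samp> wrapper, so that the result can be inspected as an ordinary string literal.
The '@' characters appear only inside string literals, and the pieces are joined by string literal concatenation, not by <samp>##</samp>.</p>

```c
#include <assert.h>
#include <string.h>

/* Hypothetical macro: same replacement list as the *.c version above, but
 * without __asm__(), so that it yields a plain string literal. */
#define SYMVER_DIRECTIVE(real, name, version) \
    ".symver " #real "," #name "@@" #version

static const char *demo_directive(void)
{
    /* Each argument is stringized as written; adjacent string literals are
     * then concatenated by the compiler. */
    return SYMVER_DIRECTIVE(__pthread_cond_wait, pthread_cond_wait, GLIBC_2.3.2);
}

/* The assertion documents the expected result. */
static void check_directive(void)
{
    assert(strcmp(demo_directive(),
        ".symver __pthread_cond_wait,pthread_cond_wait@@GLIBC_2.3.2") == 0);
}
```

<p>Note that <samp>#</samp> does not macro-expand its operand, so in this sketch the version is written directly as <samp>GLIBC_2.3.2</samp>.</p>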

<p>glibc also has many *.c or *.h files which use asm() or __asm__().
Nevertheless, it has many more *.S files.</p>
<p>If you process an assembler source with a C preprocessor at all, you should at least use /* */ or // as comment notation instead of #.
In fact, many sources of glibc use /* */ or //, whereas some sources use #.</p>
<p>Having said so, <b>mcpp</b> V.2.6.3 onward largely relaxes grammar checking in lang-asm mode to process these unusual sources, considering that glibc 2.4 has too many *.S files and that sources outside C grammar have increased since 2.1.3.</p>

<h4>3.9.8.8. Problems of versions.awk, rpcgen and -dM Option</h4>

<p>The problem of <samp>stdlib/isomac.c</samp> which I referred to at <a href="#3.9.4.8">3.9.4.8</a> is the same in glibc 2.4.</p>
<p>Also the problem of rpcgen is unchanged.</p>
<p>In addition, glibc 2.4 has the file <samp>scripts/versions.awk</samp>, which presupposes GCC's peculiar behavior regarding the number of leading spaces on each line of preprocessed output.
In order to use <b>mcpp</b> or other preprocessors, this file should be revised as follows.</p>

<pre>
$ diff -c versions.awk*
*** versions.awk        2006-12-13 00:59:56.000000000 +0900
--- versions.awk.orig   2005-03-23 10:46:29.000000000 +0900
***************
*** 50,56 ****
  }

  # This matches the beginning of a new version for the current library.
! /^ *[A-Z]/ {
    if (renamed[actlib "::" $1])
      actver = renamed[actlib "::" $1];
    else if (!versions[actlib "::" $1] && $1 != "GLIBC_PRIVATE") {
--- 50,56 ----
  }

  # This matches the beginning of a new version for the current library.
! /^  [A-Za-z_]/ {
    if (renamed[actlib "::" $1])
      actver = renamed[actlib "::" $1];
    else if (!versions[actlib "::" $1] && $1 != "GLIBC_PRIVATE") {
***************
*** 65,71 ****
  # This matches lines with names to be added to the current version in the
  # current library.  This is the only place where we print something to
  # the intermediate file.
! /^ *[a-z_]/ {
    sortver=actver
    # Ensure GLIBC_ versions come always first
    sub(/^GLIBC_/," GLIBC_",sortver)
--- 65,71 ----
  # This matches lines with names to be added to the current version in the
  # current library.  This is the only place where we print something to
  # the intermediate file.
! /^   / {
    sortver=actver
    # Ensure GLIBC_ versions come always first
    sub(/^GLIBC_/," GLIBC_",sortver)
</pre>

<h4>3.9.8.9. -include, -isystem, -I- Options</h4>

<p><samp>-isystem</samp> and <samp>-I-</samp> options are no longer used.</p>
<p>On the other hand, the <samp>-include</samp> option is used extremely frequently.
The header file <samp>include/libc-symbols.h</samp> is included by this option as many as 7000 times.
This <samp>-include</samp> option pushes a <samp>#include</samp> line out of the source into the makefile.
It makes the source incomplete, and is not recommendable.</p>

<h4>3.9.8.10. Undocumented Predefined Macros</h4>

<p>This is not a problem of glibc but of GCC.
While a few important predefined macros were undocumented in GCC 2, they got documented in GCC 3.
On the other hand, GCC 3.3 and later predefines many macros, and most of them are undocumented.</p>

<h4>3.9.8.11. Other Problems</h4>

<p><samp>debug/tst-chk1.c</samp> has a peculiar part which is not processed as intended by preprocessors other than GCC, unless revised as follows.</p>

<pre>
$ diff -cw tst-chk1.c*
*** tst-chk1.c  2007-01-11 00:31:45.000000000 +0900
--- tst-chk1.c.orig     2005-08-23 00:12:34.000000000 +0900
***************
*** 113,119 ****
  static int
  do_test (void)
  {
-   int   arg;
    struct sigaction sa;
    sa.sa_handler = handler;
    sa.sa_flags = 0;
--- 113,118 ----
***************
*** 135,146 ****
    struct A { char buf1[9]; char buf2[1]; } a;
    struct wA { wchar_t buf1[9]; wchar_t buf2[1]; } wa;

  #ifdef __USE_FORTIFY_LEVEL
!   arg = (int) __USE_FORTIFY_LEVEL;
  #else
!   arg = 0;
  #endif
!   printf ("Test checking routines at fortify level %d\n", arg);

    /* These ops can be done without runtime checking of object size.  */
    memcpy (buf, "abcdefghij", 10);
--- 134,146 ----
    struct A { char buf1[9]; char buf2[1]; } a;
    struct wA { wchar_t buf1[9]; wchar_t buf2[1]; } wa;

+   printf ("Test checking routines at fortify level %d\n",
  #ifdef __USE_FORTIFY_LEVEL
!         (int) __USE_FORTIFY_LEVEL
  #else
!         0
  #endif
!         );

    /* These ops can be done without runtime checking of object size.  */
    memcpy (buf, "abcdefghij", 10);
</pre>

<p>Despite its innocent look, the original source defines printf() as a macro, and as a consequence, <samp>#ifdef</samp> and the other directive-like lines are usually eaten as part of an argument of the macro call.
According to the Standards, the result is undefined when an argument of a macro contains a line which would otherwise act as a directive.
Since directive processing and macro expansion should be done in the same translation phase, it is arbitrary of GCC to process the directives first.
In the first place, processing of the <samp>#ifdef __USE_FORTIFY_LEVEL</samp> line itself involves macro processing; it is therefore extremely arbitrary to process this line and the other directive-like lines first and then expand the printf() macro.
C preprocessing should be done sequentially from the top.</p>
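<p>The same idea as the revision above can also be expressed with a helper macro instead of a variable.
The following is only a sketch under assumed names: <samp>EMIT</samp> stands in for a printf() that is implemented as a function-like macro, <samp>DEMO_FORTIFY_LEVEL</samp> stands in for <tt>__USE_FORTIFY_LEVEL</tt>, and <samp>out</samp> is a hypothetical buffer.
The point is that every directive is completed before the macro call begins, so that no directive-like line lies inside the argument list.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char out[64];                    /* hypothetical capture buffer */

/* Function-like macro standing in for a macro-implemented printf(). */
#define EMIT(...)   snprintf(out, sizeof out, __VA_ARGS__)

/* Select the argument with directives *before* the macro call. */
#ifdef DEMO_FORTIFY_LEVEL               /* hypothetical feature macro */
#define LEVEL_ARG   ((int) DEMO_FORTIFY_LEVEL)
#else
#define LEVEL_ARG   0
#endif

static const char *emit_level(void)
{
    /* No '#' line appears between the parentheses of the call. */
    EMIT("fortify level %d", LEVEL_ARG);
    return out;
}

/* DEMO_FORTIFY_LEVEL is not defined in this sketch, so 0 is printed. */
static void check_emit(void)
{
    assert(strcmp(emit_level(), "fortify level 0") == 0);
}
```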
<p>The configure script of glibc also has a portion that relies on GCC's peculiar help message.
The script searches the help message of the compiler for the <samp>"-z relro"</samp> option.
If you use <b>mcpp</b> as a preprocessor, this portion does not yield the expected result.
In spite of this problem, fortunately, compilation and testing of glibc are done normally.</p>
<p>By the way, while GCC up to 3.2 appended many useless -A options by default on its invocation, GCC 3.3 onward has ceased to do so.</p>

<h4>3.9.8.12. Increasing Dependency on GCC</h4>

<p>Most of the portability problems I had found in glibc 2.1.3 have not been cleared in glibc 2.4, the version six years newer.
On the contrary, the number of sources lacking portability has increased.</p>
<p>There have been a few improvements, such as the disappearance of multi-line string literals, of the <samp>-isystem, -I-</samp> options and, on the GCC side, of the <samp>-A</samp> options.</p>
<p>Meanwhile, sources with unportable features have largely increased: #include_next, variadic macros of GCC2 spec, their calls without variable argument, macro definitions with the 'defined' token in the replacement-list, *.S files and the <samp>-include</samp> option.
Macro calls with an empty argument have also increased.
Most annoying of all is the increase of constructs which do not correspond one-to-one to Standard C, and hence cannot be easily converted to portable ones.</p>
<p>All of these are problems of dependency on GCC's local specifications and undocumented behaviors.
In large scale software such as glibc, once such unportable sources are created, it becomes difficult to revise them because many source files are interrelated.
As a consequence, the same style of writing tends to be inherited for years, and even new sources are written so as to suit the old interfaces.
For example, the fact that only variadic macros of GCC2 spec are used, and neither C99 spec nor GCC3 spec ones are used at all, shows this relationship directly.
Besides, even if some unportable parts in a few sources are revised, the old unportable codings often reappear in other sources at the same time.
The old style of writing is not easily cleared.</p>
<p>On the other hand, a change of GCC behavior breaks many sources, and the possible influence becomes greater with time; therefore it becomes difficult for GCC to change its behavior.
I think that both GCC and glibc need to tidy up their old local specifications and old interfaces drastically in the near future.</p>

<h3><a name="3.9.9" href="#toc.3.9.9">3.9.9. The Problems of Linux / stddef.h, limits.h and #include_next</a></h3>
<p>On Linux, the system compiler is GCC, and the standard library is glibc.
In these circumstances, there are some system headers which presuppose only GCC.
Those are obstacles to using compiling tools other than GCC, such as <b>mcpp</b> of compiler-independent-build.
For example, <samp>stddef.h</samp> and some other Standard header files are located only in GCC's version specific include directory, and are not found in <samp>/usr/include</samp>.
These are crude deficiencies of the system header structure, and <b>mcpp</b> needs some workarounds for them.</p>
<p>On Linux, GCC installs a version specific include directory such as <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include</samp> where the Standard headers stddef.h, limits.h and some others are located.  These headers and GCC's behavior on them are queer.  The problems are the same on CygWIN as on Linux.  Mac OS X also has a few problems with some Standard headers.</p>

<h4>3.9.9.1. /usr/include Lacks Standard Headers</h4>
<p>In the first place, on Linux, five of the Standard C header files <samp>float.h, iso646.h, stdarg.h, stdbool.h, stddef.h</samp> are located only in the GCC version specific directory, and in neither <samp>/usr/include</samp> nor <samp>/usr/local/include</samp>.  The system headers on Linux seem more or less to intend that compiler systems other than GCC use only <samp>/usr/include</samp>, and that GCC uses its version specific directory in addition to <samp>/usr/include</samp>.  In fact, however, /usr/include lacks some Standard headers, and that is the problem for non-GCC compilers and preprocessors.</p>
<p>If a non-GCC preprocessor also uses the GCC version specific directory, then in the limits.h of this directory it encounters #include_next, which is a GCC specific directive.  If that is the case, why doesn't the preprocessor implement #include_next?  Even then, the limits.h causes a problem, because it is not cleanly written.  What is worse, GCC V.3.3 or later in practice predefines by itself the macros to be defined by limits.h, hence the header is useless for other preprocessors.</p>
<p>Besides, as for GCC itself, it shows queer behavior with #include_next in this header.</p>
<p>Although these problems are complicated to explain, I will describe them here, because they have been neglected for years for some reason.</p>
<p>Note that only <b>mcpp</b> of compiler-independent-build suffers this problem.
GCC-specific-build is not affected.</p>

<h4>3.9.9.2. Queer Handling of #include_next</h4>
<p>The include directories for GCC are typically set as:</p>
<pre>
/usr/local/include
/usr/lib/gcc-lib/SYSTEM/VERSION/include
/usr/include
</pre>
<p>These are searched from top to bottom.  The second is the GCC specific include directory.  SYSTEM is i386-vine-linux, i386-redhat-linux or such, VERSION is 3.3.2, 3.4.3 or such.  If you install another version of GCC into <samp>/usr/local</samp>, the <samp>/usr/lib/gcc-lib</samp> part above will become <samp>/usr/local/lib/gcc</samp>.  In C++, some other directories are set with higher priority than <samp>/usr/local/include</samp>.  For GCC V.3.* and 4.*, those are:</p>
<pre>
/usr/include/c++/VERSION
/usr/include/c++/VERSION/SYSTEM
/usr/include/c++/VERSION/backward
</pre>
<p>The names of these directories seem GCC specific; nevertheless, no other C++ standard directories exist, so other preprocessors have no directories to use but these.  For GCC 2.95, the include directory in C++ was:</p>
<pre>
/usr/include/g++-3
</pre>
<p>In addition, the directories specified by -I option or by environment variables are prepended to the list.</p>
<p>Let me take an example of limits.h in C on GCC V.3.3 or later focusing on definition of <tt>LONG_MAX</tt>, in order to make the explanations below simple.  There are two limits.h: one in <samp>/usr/include</samp> and another in the version specific directory.</p>
<pre>
#include &lt;limits.h&gt;
</pre>
<p>By this line, GCC includes <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/limits.h</samp>.  This header file starts as:</p>
<pre>
#ifndef _GCC_LIMITS_H_
#define _GCC_LIMITS_H_
#ifndef _LIBC_LIMITS_H_
#include "syslimits.h"
#endif
</pre>
<p>Then, GCC includes <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/syslimits.h</samp> which is a short file as:</p>
<pre>
#define _GCC_NEXT_LIMITS_H
#include_next &lt;limits.h&gt;
#undef _GCC_NEXT_LIMITS_H
</pre>
<p>Now, limits.h is included again.  Which limits.h?  Since this directive is #include_next, it would skip the <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include</samp>, and would search <samp>/usr/include</samp>.  GCC's cpp.info says:</p>
<blockquote>
<p>This directive works like `#include' except in searching for the specified file: it starts searching the list of header file directories _after_ the directory in which the current file was found.</p>
</blockquote>
<p>In fact, however, GCC does not include <samp>/usr/include/limits.h</samp>, but includes <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/limits.h</samp> again somehow.<br>
This time <tt>_GCC_LIMITS_H_</tt> has been defined already, so the block beginning with the line:</p>
<pre>
#ifndef _GCC_LIMITS_H_
</pre>
<p>is skipped, and the next block is evaluated:</p>
<pre>
#else
#ifdef _GCC_NEXT_LIMITS_H
#include_next &lt;limits.h&gt;
#endif
#endif
</pre>
<p>Again, this is just the same #include_next &lt;limits.h&gt; as was found in <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/syslimits.h</samp>.  Does GCC include <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/limits.h</samp>, which is the current file, again as it did the previous time, and run into infinite recursion?  No, it does not; this time it includes <samp>/usr/include/limits.h</samp>.  The behavior of GCC is beyond my understanding.</p>
<p>In <samp>/usr/include/limits.h</samp>, &lt;features.h&gt; and some other headers are included.  Also, <samp>/usr/include/limits.h</samp> has a block beginning with the line:</p>
<pre>
#if !defined __GNUC__ || __GNUC__ &lt; 2
</pre>
<p>In this block, &lt;bits/wordsize.h&gt; is included, and the Standard required macros are defined depending on whether the wordsize is 32 bit or 64 bit.  For example, if the wordsize is 32 bit, <tt>LONG_MAX</tt> is defined as:</p>
<pre>
#define LONG_MAX     2147483647L
</pre>
<p>Of course, GCC skips this block.  Then, going to the end of this file, it returns to <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/limits.h</samp>.  Then, ending this file of the second inclusion, it returns to <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/syslimits.h</samp>.  Then, this file ends too, and GCC returns to the first inclusion of <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/limits.h</samp>.  In this file, after the above cited part, there are definitions of the Standard required macros.  For instance, <tt>LONG_MAX</tt> is defined as:</p>
<pre>
#undef LONG_MAX
#define LONG_MAX __LONG_MAX__
</pre>
<p>Then, the file ends.</p>
<pre>
#include &lt;limits.h&gt;
</pre>
<p>The processing of this line has ended.  After all, <tt>LONG_MAX</tt> is defined to <tt>__LONG_MAX__</tt> and it is the end.  What is <tt>__LONG_MAX__</tt>?  As a matter of fact, GCC V.3.3 or later predefines many macros including <tt>__LONG_MAX__</tt> which is predefined to 2147483647L for 32 bit system.  As with the other Standard required macros, the situations are almost the same as <tt>LONG_MAX</tt>, because they are defined using the predefined ones.  If so, what is the purpose of these complicated header files and #include_next handling at all?</p>
<p>The behavior of GCC V.2.95, V.3.2, V.3.4, V.4.0 and V.4.1 on #include_next is the same as V.3.3.  That is to say:</p>
<pre>
#include_next &lt;limits.h&gt;
</pre>
<p>by this line in <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/syslimits.h</samp>, GCC includes <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/limits.h</samp>, and by the same line in this file:</p>
<pre>
#include_next &lt;limits.h&gt;
</pre>
<p>it includes <samp>/usr/include/limits.h</samp>.  As a result, in processing the line:</p>
<pre>
#include &lt;limits.h&gt;
</pre>
<p><samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/limits.h</samp> is included twice.  This duplicate inclusion happens to produce the same result; nevertheless it is redundant, and above all, the behavior differs from the specification and is not consistent.  In addition, the following part of the file would be redundant if the behavior accorded with the specification.</p>
<pre>
#else
#ifdef _GCC_NEXT_LIMITS_H
#include_next &lt;limits.h&gt;
#endif
</pre>
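<p>For comparison, the search rule as documented in cpp.info can be modelled by the following sketch.
It is not GCC's implementation; the in-memory tables and all function names are hypothetical, and the directory contents are limited to the files discussed in this section.
Under this model, #include_next &lt;limits.h&gt; issued from a file found in the version specific directory reaches <samp>/usr/include</samp> at once.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the include path, in search order. */
static const char *const dirs[] = {
    "/usr/local/include",
    "/usr/lib/gcc-lib/SYSTEM/VERSION/include",
    "/usr/include",
};
#define NDIRS   (sizeof dirs / sizeof dirs[0])
#define MAXF    3

/* has_file[i] lists the headers assumed to exist in dirs[i]. */
static const char *const has_file[NDIRS][MAXF] = {
    { NULL },
    { "limits.h", "syslimits.h", NULL },
    { "limits.h", NULL },
};

static int dir_has(size_t d, const char *name)
{
    size_t i;

    for (i = 0; i < MAXF && has_file[d][i] != NULL; i++)
        if (strcmp(has_file[d][i], name) == 0)
            return 1;
    return 0;
}

/* Search from directory index 'start'; return the index found, or -1. */
static int search_from(size_t start, const char *name)
{
    size_t d;

    for (d = start; d < NDIRS; d++)
        if (dir_has(d, name))
            return (int) d;
    return -1;
}

/* #include <name>: search from the top of the list. */
static int include_plain(const char *name)
{
    return search_from(0, name);
}

/* #include_next <name> as documented: search starts after the directory
 * in which the current file was found. */
static int include_next(int current_dir, const char *name)
{
    return search_from((size_t) current_dir + 1, name);
}

/* The assertions document the results expected from the documented rule. */
static void check_search(void)
{
    assert(include_plain("limits.h") == 1);     /* version specific dir */
    assert(include_plain("syslimits.h") == 1);
    assert(include_next(1, "limits.h") == 2);   /* /usr/include         */
    assert(include_next(2, "limits.h") == -1);  /* nothing further      */
}
```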
<h4>3.9.9.3. Standard Headers not Available for Preprocessors other than GCC</h4>
<p>Now, what happens to compiler or preprocessor other than GCC using Linux standard headers?  stddef.h and some other Standard headers are not found in <samp>/usr/include</samp> nor <samp>/usr/local/include</samp>.  If so, how about using also GCC version specific directory?</p>
<pre>
#include &lt;limits.h&gt;
</pre>
<p>By this line, the preprocessor includes <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/limits.h</samp>, and from this file it includes <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/syslimits.h</samp>, and in this file, it sees the line:</p>
<pre>
#include_next &lt;limits.h&gt;
</pre>
<p>Then, how about implementing #include_next?  If #include_next is implemented according to its specification, the preprocessor searches by this line the "next" include directory <samp>/usr/include</samp>, and includes <samp>/usr/include/limits.h</samp>.  Then, this non-GCC preprocessor processes the block beginning with this line:</p>
<pre>
#if !defined __GNUC__ || __GNUC__ &lt; 2
</pre>
<p>In this block it defines <tt>LONG_MAX</tt> as:</p>
<pre>
#define LONG_MAX     2147483647L
</pre>
<p>and defines also the other macros appropriately.  Then, it ends this file, and returns to <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/syslimits.h</samp>.  Then, it ends this file, and returns to <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/limits.h</samp>.  And it encounters these lines:</p>
<pre>
#undef LONG_MAX
#define LONG_MAX __LONG_MAX__
</pre>
<p>At the end of the long run, all the correct definitions are canceled, and they become the undefined name <samp>__LONG_MAX__</samp> or such!</p>
<p>Up to GCC V.3.2, the corresponding part of version specific limits.h had the lines like:</p>
<pre>
#define __LONG_MAX__ 2147483647L
</pre>
<p>Hence, the canceled macros were redefined correctly.  Although most of the processing was useless, the results were correct.  With the header files of V.3.3 or later, a non-GCC preprocessor is led around from file to file only to get useless results.</p>
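<p>Whether a preprocessor has been left with such an unusable definition can be detected in the source itself.
The following sketch uses hypothetical names (<samp>LONG_MAX_IS_USABLE</samp> and the helper functions).
It relies on the Standard rule that any identifier remaining after macro expansion is evaluated as 0 in a #if expression, so a <tt>LONG_MAX</tt> which expands to an undefined name such as <samp>__LONG_MAX__</samp> fails the test.</p>

```c
#include <assert.h>
#include <limits.h>

/* If LONG_MAX expands to an identifier that is not a macro, the expression
 * below becomes '0 > 0' and the #else branch is taken. */
#if LONG_MAX > 0
#define LONG_MAX_IS_USABLE  1
#else
#define LONG_MAX_IS_USABLE  0
#endif

static int long_max_is_usable(void)
{
    return LONG_MAX_IS_USABLE;
}

/* With a correctly working <limits.h>, the check succeeds. */
static void check_long_max(void)
{
    assert(long_max_is_usable() == 1);
}
```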

<h4>3.9.9.4. Workarounds for the Present</h4>
<p>The problems are summarized as below: *1, *2, *3, *4</p>
<ol>
<li><samp>/usr/include</samp> lacks the Standard C headers <samp>float.h, iso646.h, stdarg.h, stdbool.h</samp> and <samp>stddef.h</samp>, which are necessary to make Linux system headers usable by non-GCC compiler systems.<br>
<li>C++ include directories do not exist other than <samp>/usr/include/c++/VERSION/*</samp>.  In order to make C++ standard include directories independent of the GCC version, <samp>/usr/include/c++</samp> should be used instead of <samp>/usr/include/c++/VERSION</samp>, which should be limited to GCC specific headers.
Having said that, this is a difficult task, since the C++ standard library is libstdc++ distributed with GCC on all of FreeBSD, Linux and Mac OS X.<br>
<li>The behavior of GCC on #include_next differs from its specification and is inconsistent.<br>
<li>It is meaningless to process the complicated limits.h headers, since GCC in effect predefines the Standard required macros by itself.  It is doubly meaningless, since <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/limits.h</samp> does #undef all of them.  As far as Linux and CygWIN are concerned, there seems to be no necessity for splitting limits.h into two.  Since the headers in this directory are auto-generated by GCC installation, some redundancies are inevitable.  Yet, these are too dirty to install as system headers.<br>
</ol>
<p>Underlying these problems is the excessively complicated structure of the system headers.  The extension directive #include_next adds to the complication.  The use of this directive is very limited.  Though GCC and glibc use it when compiling and installing themselves, it does not appear in the installed system headers except for limits.h.  That rare example in limits.h causes the confusion in GCC mentioned above.  This calls into question the reason for its existence.</p>
<p>Anyway, the compiler-independent-build of <b>mcpp</b> needs the following workarounds for the present.  In order to avoid confusion, the compiler-independent-build neither implements #include_next nor uses GCC-specific include directories.</p>
<ol>
<li>Link <samp>/usr/include/stddef.h</samp> to <samp>/usr/lib/gcc-lib/SYSTEM/VERSION/include/stddef.h</samp>.  If multiple versions of GCC have been installed, a link to any of them will work for mere preprocessing.  This setting does no harm to GCC or to the GCC-specific-build of <b>mcpp</b>.  The same can be said about stdarg.h, though it expands macros to GCC built-in functions.<br>
<li>Copy or move iso646.h and stdbool.h from one of the GCC-specific directories to <samp>/usr/include</samp>, since these are quite simple headers and independent of any system.  As for limits.h, the existing <samp>/usr/include/limits.h</samp> is enough for a non-GCC preprocessor.<br>
<li>float.h is useless for other preprocessors, since, for example, <tt>DBL_MAX_EXP</tt> is defined as <samp>__DBL_MAX_EXP__</samp>.  If required, you must write the header yourself, referring to the internal settings of GCC or some other source. *5<br>
<li>Do not put the GCC-specific include directory in the list of C include directories set by environment variable.<br>
<li>Set C++ include directories by environment variable CPLUS_INCLUDE as <samp>/usr/include/c++/VERSION:/usr/include/c++/VERSION/SYSTEM:/usr/include/c++/VERSION/backward</samp>.<br>
</ol>
<p>For the GCC-specific-build of <b>mcpp</b>, no special setting is required, because it has the list of GCC-specific include directories, implements #include_next according to its specification, and predefines the macros as GCC does.</p>
<p>Note:</p>
<p>*1 I have checked the descriptions of this 3.9.9 section on Linux / GCC 2.95.3, 3.2, 3.3.2, 3.4.3, 4.0.2, 4.1.1, 4.3.0 and on CygWIN / GCC 2.95.3, 3.4.4.  As for CygWIN, #include_next behaved according to its specification on GCC 2.95.3, but on 3.4.4 it changed to the same behavior as on Linux.  The C++ include directory on CygWIN was <samp>/usr/include/g++-3</samp> on 2.95.3, while on 3.4.4 the directories are <samp>/usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++</samp> and its sub-directories.</p>
<p>*2 On FreeBSD 6.2 or 6.3 and its bundled GCC 3.4.6, all the Standard C headers are present in <samp>/usr/include</samp>, #include_next is not used in any system headers, and GCC specific C include directory does not exist.  However, C++ include directories are GCC version dependent as <samp>/usr/include/c++/3.4, /usr/include/c++/3.4/backward</samp>.<br>
Even on FreeBSD, installing another version of GCC creates a GCC-version-specific include directory.  Most of the headers in the directory are redundant.  However, the headers in <samp>/usr/include</samp> remain unchanged.</p>
<p>*3 On Mac OS X Leopard / Apple-GCC 4.0.1, as on Linux, there is a GCC-version-specific include directory, #include_next is used in <samp>limits.h</samp> and a few other headers, and two <samp>limits.h</samp> are found.
However, #include_next in <samp>syslimits.h</samp> has been deleted by Apple.
<samp>float.h, iso646.h, stdarg.h, stdbool.h, stddef.h</samp> are all found in <samp>/usr/include</samp>, hence few special settings are necessary for <b>mcpp</b>.
But <samp>float.h, stdarg.h</samp> are only for GCC and Metrowerks (for powerpc), so if you use them with <b>mcpp</b>, you must rewrite <samp>float.h</samp> yourself and make <samp>stdarg.h</samp> include the GCC-version-specific one.
Note that some definitions in <samp>float.h</samp> are different between x86 and powerpc.</p>
<p>*4 On MinGW / GCC 3.4.*, though the include directories and their precedence differ from the other systems, the behavior of GCC on #include_next is the same, and also some Standard headers are not in the standard include directory <samp>/mingw/include</samp> but in its version-specific-directory.</p>
<p>*5 float.h for an i386 system can be written as follows, referring to GCC's settings:</p>
<pre>
/* float.h  */

#ifndef _FLOAT_H___
#define _FLOAT_H___

#define FLT_ROUNDS      1
#define FLT_RADIX       2

#define FLT_MANT_DIG    24
#define DBL_MANT_DIG    53
#define LDBL_MANT_DIG   64

#define FLT_DIG         6
#define DBL_DIG         15
#define LDBL_DIG        18

#define FLT_MIN_EXP     (-125)
#define DBL_MIN_EXP     (-1021)
#define LDBL_MIN_EXP    (-16381)

#define FLT_MIN_10_EXP  (-37)
#define DBL_MIN_10_EXP  (-307)
#define LDBL_MIN_10_EXP (-4931)

#define FLT_MAX_EXP     128
#define DBL_MAX_EXP     1024
#define LDBL_MAX_EXP    16384

#define FLT_MAX_10_EXP  38
#define DBL_MAX_10_EXP  308
#define LDBL_MAX_10_EXP 4932

#define FLT_MAX         3.40282347e+38F
#define DBL_MAX         1.7976931348623157e+308
#define LDBL_MAX        1.18973149535723176502e+4932L

#define FLT_EPSILON     1.19209290e-7F
#define DBL_EPSILON     2.2204460492503131e-16
#define LDBL_EPSILON    1.08420217248550443401e-19L

#define FLT_MIN         1.17549435e-38F
#define DBL_MIN         2.2250738585072014e-308
#define LDBL_MIN        3.36210314311209350626e-4932L

#if defined (__STDC_VERSION__) &amp;&amp; __STDC_VERSION__ &gt;= 199901L
#define FLT_EVAL_METHOD 2
#define DECIMAL_DIG     21
#endif /* C99 */

#endif /* _FLOAT_H___ */
</pre>
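<p>A hand-written float.h like the one above can be given a rough sanity check before use.  The following is only a minimal sketch, not part of <b>mcpp</b>: the function names are hypothetical, it assumes a radix-2 floating-point format, and it checks whichever float.h the compiler actually includes.</p>
<pre>
#include &lt;float.h&gt;

/* Returns 1 if x equals 2 raised to the power -(digits - 1),
 * which is what *_EPSILON must be for a radix-2 format. */
static int demo_is_epsilon_for(double x, int digits)
{
    double eps = 1.0;
    int i;

    for (i = 1; i &lt; digits; i++)
        eps /= 2.0;             /* exact: division by a power of 2 */
    return x == eps;
}

/* Returns 1 if some basic relations among the float.h macros hold. */
int demo_float_h_looks_consistent(void)
{
    return FLT_RADIX == 2
        &amp;&amp; demo_is_epsilon_for(FLT_EPSILON, FLT_MANT_DIG)
        &amp;&amp; demo_is_epsilon_for(DBL_EPSILON, DBL_MANT_DIG)
        &amp;&amp; FLT_MIN_EXP &lt; 0 &amp;&amp; FLT_MAX_EXP &gt; 0
        &amp;&amp; DBL_MIN_EXP &lt; 0 &amp;&amp; DBL_MAX_EXP &gt; 0
        &amp;&amp; DBL_DIG &gt;= 10;      /* minimum required by Standard C */
}
</pre>
<p>Such a check cannot prove that every value is right for the target, but it catches typing errors in the mantissa and epsilon lines.</p>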
<br>

<h3><a name="3.9.10" href="#toc.3.9.10">3.9.10. Problems of Mac OS X / Apple-GCC and its System Headers</a></h3>
<p>In V.2.7, <b>mcpp</b> began to support Mac OS X / GCC.
This section describes the problems of the system found by <b>mcpp</b>.
The author, however, does not know the system well yet.
He has only compiled <b>mcpp</b> itself and firefox on the system.
He knows nothing about Objective C or Objective C++.</p>
<p>Since GCC is practically the only compiler on this system now, some dependencies on GCC-local specs are found in some of its system headers.
Such dependencies are fewer than on Linux, maybe because its standard library is not glibc.
But they are more numerous than on FreeBSD.
Some tidying up is desirable.</p>
<p>Another characteristic of this system is that the system compiler is a GCC largely modified and extended by Apple.
In the system headers and some sources of Apple on Mac OS X, dependencies on Apple-GCC-local specs are more conspicuous than those on general-GCC-local specs.
In particular, the extended specs to support both Intel-Mac and PowerPC-Mac on one machine are the most characteristic.</p>
<p>Here, we refer to the system of Mac OS X Leopard / Apple-GCC 4.0.1.</p>

<h4>3.9.10.1. #include_next, #warning</h4>
<p>Uses of the GCC-local directive #include_next are few, but they are found in <samp>float.h, stdarg.h, varargs.h</samp> in <samp>/usr/include/</samp> and the files of the same name in <samp>/Developer/SDKs/MacOSX10.*.sdk/usr/include/</samp>.
All of them are to include different real header files depending on whether the compiler is GCC or Metrowerks.
When the compiler is GCC, <samp>stdarg.h</samp>, for example, does '#include_next &lt;stdarg.h&gt;'.<br>
The <samp>limits.h</samp> in the GCC-version-specific include directory has #include_next as on Linux, but the one in <samp>syslimits.h</samp> has been removed and the file tidied up a bit.</p>
<p>Though this directive is used modestly, it is a problem that <samp>float.h, stdarg.h</samp> presuppose only GCC and Metrowerks.
They could be written more portably, as on FreeBSD. *1<br>
In addition, #include_next for GCC in a header in <samp>/usr/include</samp> is nonsense, because the priority of that include directory is lower than that of the GCC-version-specific one.
Consequently this #include_next is never executed.</p>
<p>Another GCC-local directive #warning is sometimes found in <samp>objc/, wx-2.8/wx/</samp> and a few other directories in <samp>/usr/include/</samp>, and their corresponding directories in <samp>/Developer/SDKs/MacOSX*.sdk/usr/include/</samp>.
Most of the directives are warnings against obsolete or deprecated files or usages.<br>
<samp>backward_warning.h</samp> in <samp>/usr/include/c++/VERSION/backward/</samp> and its corresponding file in <samp>/Developer/SDKs/MacOSX*.sdk/</samp> are to execute #warning against these deprecated headers.
And all the headers in the directories include this header.
This is the same as on Linux or FreeBSD.</p>
<p>Note:</p>
<p>*1 About how to use these headers with the compiler-independent-build of <b>mcpp</b>, refer to 3.9.9.4 and its note 3.</p>

<h4>3.9.10.2. A Macro Expanded to 'defined'</h4>
<p><samp>/usr/include/sys/cdefs.h</samp> and its corresponding file of the same name in <samp>/Developer/SDKs/MacOSX*.sdk/</samp> have a macro definition as:</p>
<pre>
#define __DARWIN_NO_LONG_LONG   (defined(__STRICT_ANSI__) \
                &amp;&amp; (__STDC_VERSION__-0 &lt; 199901L) \
                &amp;&amp; !defined(__GNUG__))
</pre>
<p>And it is used in <samp>stdlib.h</samp> and a few others as:</p>
<pre>
#if __DARWIN_NO_LONG_LONG
</pre>
<p>This macro should be defined as: *1</p>
<pre>
#if     defined(__STRICT_ANSI__) \
                &amp;&amp; (__STDC_VERSION__-0 &lt; 199901L) \
                &amp;&amp; !defined(__GNUG__)
#define __DARWIN_NO_LONG_LONG   1
#endif
</pre>
<p>Note:</p>
<p>*1 As for its reason, see <a href="#3.9.4.6">3.9.4.6</a> and <a href="#3.9.8.6">3.9.8.6</a>.</p>
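<p>The portable pattern can be tried out in isolation.  The following is a minimal sketch with hypothetical macro names (<tt>DEMO_FEATURE_A</tt> and <tt>DEMO_FEATURE_B</tt> stand in for names such as <tt>__STRICT_ANSI__</tt>).  Unlike the rewrite shown above, it defines the result macro as 0 in the false case as well, which is a matter of taste.  The point is that 'defined' appears only on the #if line itself, never in the replacement text of a macro, since Standard C leaves the behavior undefined when 'defined' is generated by macro expansion in a #if expression.</p>
<pre>
/* Hypothetical feature macro; DEMO_FEATURE_B is deliberately not defined. */
#define DEMO_FEATURE_A  1

/* 'defined' is evaluated here, on the #if line, not inside a macro. */
#if     defined(DEMO_FEATURE_A) &amp;&amp; !defined(DEMO_FEATURE_B)
#define DEMO_A_WITHOUT_B    1
#else
#define DEMO_A_WITHOUT_B    0
#endif

/* Returns 1 when A is defined and B is not, else 0. */
int demo_a_without_b(void)
{
#if DEMO_A_WITHOUT_B    /* a plain 0 or 1: portable to any preprocessor */
    return 1;
#else
    return 0;
#endif
}
</pre>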

<h4>3.9.10.3. Tokens in #endif line</h4>
<p><samp>gssapi.h, krb5.h, profile.h</samp> in <samp>/System/Library/Frameworks/Kerberos.framework/Headers</samp> have queer #endif lines like:</p>
<pre>
#endif \* __KERBEROS5__ */
</pre>
<p>This <samp>\* __KERBEROS5__ */</samp> seems to be intended as a comment.
I cannot understand why such a notation had to be invented.
Though GCC usually warns about it, Apple-GCC does not issue any warning even if <samp>-pedantic</samp> or any other options are specified.
Apple-GCC does not warn about the following case either.
It still has a pre-C90 flavor.</p>
<pre>
#endif __KERBEROS5__
</pre>

<h4>3.9.10.4. Some Special Usages of Macro</h4>
<p>As far as a compilation of firefox 3.0b3pre source is concerned, none of the following special usages of macros, which are frequently found in glibc sources and Linux system headers, is found in the Mac OS X system headers included from the firefox source.</p>
<ul>
<li>GCC-specific variable argument macro.<br>
While a few variadics of C99-spec are found, no GCC-specific one is found.
<li>Empty argument of macro call.
<li>Object-like macro replaced with a function-like macro name.
</ul>

<h4>3.9.10.5. Apple-GCC's Peculiar Specifications</h4>
<p>Apple-GCC has some peculiar specifications different from the general GCC.</p>
<ul>
<li><p>Specs to generate binaries for both Intel-Mac and PowerPC-Mac on either machine</p>
<p>Mac OS X has a pair of compilers for x86 and ppc.
(One is a native compiler, and the other is a cross compiler.)
This pair of Apple-GCCs have their own option <samp>-arch</samp>.
If you specify multiple CPUs as '<samp>-arch i386 -arch ppc</samp>', gcc will be repeatedly invoked, binaries for the specified CPUs will be generated, and a "universal binary" which bundles all the binaries will be created.
Also they have another peculiar option <samp>-mmacosx-version-min=</samp>.
You can use this option along with <samp>-isysroot</samp> or <samp>--sysroot</samp> option, and widen the range of compatibility of the binary to the older versions of Mac OS X to some extent.
These specs are convenient to make a binary package for Mac OS X.</p>
<p>As for preprocessing, you should remember that some predefined macros differ depending on the CPU specified.</p>
<li><p>"framework" directories</p>
<p>Mac OS X has "framework" directories inherited from NeXTstep.
A framework is a hierarchical directory that contains shared resources such as header files, libraries, documents, and some other resources.
To include a header file in these directories, a directive such as the following is used:</p>
<pre>
#include &lt;Kerberos/Kerberos.h&gt;
</pre>
<p>This has the same form as:</p>
<pre>
#include &lt;sys/stat.h&gt;
</pre>
<p>However, these two have quite different meanings.
While the latter includes the file  <samp>sys/stat.h</samp> in some include directory (in this case <samp>/usr/include</samp>), <samp>&lt;Kerberos/Kerberos.h&gt;</samp> is not a path-list, and <samp>Kerberos</samp> is not even a directory name.
This is a file <samp>Kerberos.framework/Headers/Kerberos.h</samp> in a framework directory <samp>/System/Library/Frameworks</samp>.
And actually, <samp>Kerberos.framework/Headers</samp> is a symbolic link to <samp>Kerberos.framework/Versions/Current/Headers</samp>.
This is the simplest case of framework header file location.
There are many other far more complex cases.</p>
<p>Who has invented such a complex system?
This system burdens a preprocessor, because it needs to search system headers in framework directories repeatedly building and rebuilding path-list.
Some headers further include many other headers.</p>
<li><p>"header map" file</p>
<p>Xcode.app is an IDE on Mac OS X.
It uses a "header map" file, which is a list of include files.
One of the tools of Xcode checks source files, searches the files to be included, and records the path-list of the files into a file named *.hmap, and Apple-GCC refers to it instead of include directories.
This is an extended feature of Apple-GCC.</p>
<p>Header map file is a device to lessen burdens of header file searching.
However, it is a binary file of a peculiar specification and lacks transparency.
In order to lessen the heavy burden of framework header searching, it would be better to reorganize the framework system.</p>
<li><p>Tokens in #endif line</p>
<p>As shown in the previous section, Apple-GCC does not issue even a warning no matter what junk is on a #endif line, regardless of the options specified.
It is quite an anachronism.</p>
<li><p>Non-ASCII characters in comments</p>
<p>This is not a problem of GCC, but a problem of system headers in framework directory.
In many headers, some non-ASCII characters are frequently found in comments, such as the copyright mark (0xA9) and others of ISO-8859-* (?).
They are nuisances in a multibyte-character environment, even in comments.
A little bit of character encoding consciousness is desired.
Though the characters of this kind are sometimes found also in /usr/include of Linux, they are far more often found in framework headers of Mac OS.</ul>

<h3><a name="3.9.11" href="#toc.3.9.11">3.9.11. Preprocessing firefox 3.0b3pre</a></h3>
<p>I compiled the source of the firefox development version 3.0b3pre (January, 2008), or 3.0-beta3-prerelease, by GCC, replacing its preprocessor with <b>mcpp</b> V.2.7, on Linux/x86 + GCC 4.1.2 and Mac OS X + GCC 4.0.1.
<b>mcpp</b> was executed with the -Kv option, passing its output to cc1 (cc1plus).
As a result, the compilations completed successfully, and the firefox binaries were generated. *1</p>
<p>The preprocessing portability of firefox source on the whole is rather high.
Dependencies on GCC-local specifications, such as those frequently found in glibc sources, are few.
It is portable enough to officially support GCC on Linux and Mac OS X as well as Visual C++ on Windows.</p>
<p>Preprocessing portability of a source is, however, not necessarily sufficient, even if GCC and Visual C pass it.
In the sections below, I will check some problems, sometimes comparing them with glibc sources.
I omit explanations on GCC's problems here to avoid duplication.
For GCC's problems, refer to <a href="#3.9.4">3.9.4</a> and <a href="#3.9.8">3.9.8</a>, which also comment on glibc sources. *2</p>
<p>Note:</p>
<p>*1 I checked out the sources from CVS repository of mozilla.org.
One of the motivations to compile firefox source was to test -K option of <b>mcpp</b>.
This option was proposed by Taras Glek, and he was working on refactoring of C/C++ source at mozilla.com.
So, I also used firefox source to test -K option and other behaviors of <b>mcpp</b>.
About -K (-Kv) option, refer to <a href="#2.4">2.4</a>.</p>
<p>*2 There is a list of coding guidelines for firefox as below.
However, its content is quite out of date.<br>
<a href="http://www.mozilla.org/hacking/portable-cpp.html">portable-cpp</a></p>
 
<h4>3.9.11.1. GCC-local specifications are rarely used</h4>
<p>The following GCC-local-specs, which are sometimes used in glibc sources, are not used in firefox sources.
Though compiling firefox on Linux includes system headers, some of which contain GCC2-spec variadic macros and the like, those are not firefox sources themselves.</p>
<ul>
<li>GCC-local variadic macro<br>
Even C99-spec variadic macros are not yet used.
</ul>
<p>The following features are not used even in recent glibc, and not used in firefox at all.</p>
<ul>
<li>Multi-line string-literal
<li>#warning
<li>-isystem, -I- options
</ul>

<h4>3.9.11.2. #include_next</h4>
<p>However, many #include_next directives are found, all of them in one directory: <samp>config/system_wrappers/</samp>, which is generated by configure.
All of the 900 files generated in the directory are short header files of the same pattern.
For example, <samp>stdio.h</samp> is:</p>
<pre>
#pragma GCC system_header
#pragma GCC visibility push(default)
#include_next &lt;stdio.h&gt;
#pragma GCC visibility pop
</pre>
<p>This is a code to utilize '<samp>#pragma GCC visibility *</samp>' directive implemented in GCC 4.*.
At the same time, there is a file '<samp>config/gcc_hidden.h</samp>' as below.
The file is specified by -include option for most of the translation units, and read in at the start of the units.</p>
<pre>
#pragma GCC visibility push(hidden)
</pre>
<p>The <samp>system_wrappers</samp> directory should be the include directory with the highest priority, so you should specify it with the first -I option.
In spite of such a constraint, this usage of #include_next is simple and seems to have no problem.</p>
<p>On the other hand, for many sources in the <samp>nsprpub</samp> directory, the '<samp>-fvisibility=hidden</samp>' option is used instead of '<samp>-include gcc_hidden.h</samp>', and the headers in <samp>system_wrappers</samp> are not used.
This <samp>nsprpub</samp> directory seems yet to be reorganized.</p>

<h4>3.9.11.3. Using C99 specs without specifying C99</h4>
<p>Many sources use C99 specifications without specifying C99.
GCC uses the "gnu89" spec by default for *.c sources, which is a compromise consisting of C90 plus some C99 specs and GCC-local specs.
Some firefox sources use the following C99 specs implicitly, depending on GCC's default behavior.</p>
<ul>
<li><p>Empty argument in macro call</p>
<p>Though empty arguments in macro calls are rare in firefox, these 3 files have them.
The only macro actually called with an empty argument is <tt>NS_ENSURE_TRUE</tt>.</p>
<p><samp>layout/style/nsHTMLStyleSheet.cpp, layout/generic/nsObjectFrame.cpp, intl/uconv/src/nsGREResProperties.cpp</samp></p>
<p>The following files in <samp>gfx/cairo/cairo/src/</samp> also have it.
Here too only one macro is involved: <tt>slim_hidden_ulp2</tt>.</p>
<p><samp>cairoint.h, cairo-font-face.c, cairo-font-options.c, cairo-ft-font.c, cairo-ft-private.h, cairo-image-surface.c, cairo-matrix.c, cairo-pattern.c, cairo-scaled-font.c, cairo-surface.c, cairo-xlib-surface.c, cairo.c</samp></p>
<p>Though these empty macro arguments are used on Linux, they are not used on Mac OS X.
Anyway, these are not tricky ones.</p>
<li><p>Translation limits beyond C90</p>
<p>Length of an identifier, nesting level of #include, number of macro definitions and so forth often exceed C90 translation limits.<br>
Identifiers longer than 31 bytes are found especially frequently in the directory <samp>gfx/cairo/cairo/src/</samp>.
#include nesting deeper than 8 levels and more than 1024 macro definitions are often found, too.
These are almost inevitable on Linux and Mac OS X, since merely including some system headers often reaches these limits.</p>
<li><p>Using <samp>//</samp> comment in C source</p>
<p>Some C sources have this type of comments.
The <a href="http://www.mozilla.org/hacking/portable-cpp.html">list of guidelines</a> prohibits this.
However, this causes few problems nowadays.</p>
</ul>
<p>The above specifications are also available on Visual C 2005, 2008.
Since GCC has the <samp>-std=c99</samp> option, we might use this to specify C99 explicitly.
Visual C, however, has no option to specify a version of the Standard.
We cannot help using C99 specs implicitly.
Therefore, firefox sources cannot be blamed for using C99 specs implicitly, in the current state of the major compiler-systems. *1</p>
<p>By the way, firefox sources do not use variadic macros for some reason, in spite of using some other C99 specs implicitly.
Visual C up to 2003 did not implement variadic macros.
Is that why firefox did not use the feature?
The circumstances have changed since Visual C 2005 implemented it.</p>
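<p>Where a source wants to use a C99 feature such as a variadic macro without relying on the compiler's default mode, the use can be guarded by <tt>__STDC_VERSION__</tt>.  The following is a minimal sketch with hypothetical names (<tt>DEMO_FORMAT</tt>, <tt>demo_format_pair_len</tt>), not code taken from firefox.  Note that a compiler which accepts variadic macros without defining <tt>__STDC_VERSION__</tt> as 199901L or greater simply takes the fallback branch.</p>
<pre>
#include &lt;stdio.h&gt;

#if     defined(__STDC_VERSION__) &amp;&amp; __STDC_VERSION__ &gt;= 199901L
/* C99 or later: the standard variadic macro is available. */
#define DEMO_FORMAT(buf, ...)   sprintf(buf, __VA_ARGS__)
#define DEMO_HAVE_VARIADIC      1
#else
#define DEMO_HAVE_VARIADIC      0
#endif

/* Formats "a,b" into an internal buffer and returns the length written. */
int demo_format_pair_len(int a, int b)
{
    static char buf[64];

#if DEMO_HAVE_VARIADIC
    return DEMO_FORMAT(buf, "%d,%d", a, b);
#else
    return sprintf(buf, "%d,%d", a, b);     /* C90 fallback */
#endif
}
</pre>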
<p>Note:</p>
<p>*1 On C++, GCC defaults to the "gnu++98" spec, which is explained as "C++98 plus GCC extensions".  Actually, however, it has some C99 specs mixed in.
Meanwhile, Visual C says that it is based on C90 and C++98 for C and C++ respectively.
Actually, both C and C++ of Visual C have C99 features mixed in as well as a few Visual C extensions, especially in Visual C 2005 and 2008.
Both GCC and Visual C have such a mixture of versions of the Standard and their own extensions and modifications, and thus bring about some ambiguities.
The absence of an option in Visual C to specify a version of the Standard is the most inconvenient problem.</p>

<h4>3.9.11.4. Object-like macro replaced with function-like macro name</h4>
<p>Object-like macro replaced with function-like macro name is found sometimes in many other programs, and is found also in firefox sources below, though not frequently.</p>
<samp>content/base/src/nsTextFragment.h,
modules/libimg/png/mozpngconf.h,
modules/libjar/zipstub.h,
modules/libpr0n/src/imgLoader.h,
nsprpub/pr/include/obsolete/protypes.h,
nsprpub/pr/include/private/primpl.h,
nsprpub/pr/include/prtypes.h,
parser/expat/lib/xmlparse.c,
security/nss/lib/jar/jarver.c,
security/nss/lib/util/secport.h,
xpcom/glue/nsISupportsImpl.h
</samp>
<p>In addition, building firefox creates, in a directory for the development environment, many links to header files, which are copied into /usr/include/firefox-VERSION/ when you install the development environment for firefox.
Some of these header files are symbolic links to the above files.
<samp>mozilla-config.h</samp>, which is created by configure, has a macro definition of this kind, too.</p>
<p>These macros should be written as function-like macros to improve readability.
In fact, many other macros in firefox sources are defined as function-like macros that are replaced with another function-like macro taking the same arguments.
There are coding style differences among the authors.
It would be better to set a coding guideline on this matter.</p>
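<p>The two styles can be compared side by side.  The following is a minimal sketch with hypothetical names, not code from firefox.  Both macros expand to the same call, but only the function-like form shows at the point of definition that arguments are expected.</p>
<pre>
#define demo_add(a, b)      ((a) + (b))

/* Object-like macro replaced with a function-like macro name:
 * the arguments are picked up only when the result is rescanned. */
#define demo_sum_obj        demo_add

/* Function-like macro replaced with a function-like macro call:
 * the parameter list is visible in the definition itself. */
#define demo_sum_fn(a, b)   demo_add(a, b)

int demo_use_obj(int x, int y)
{
    return demo_sum_obj(x, y);      /* becomes ((x) + (y)) */
}

int demo_use_fn(int x, int y)
{
    return demo_sum_fn(x, y);       /* becomes ((x) + (y)) */
}
</pre>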

<h4>3.9.11.5. Macro expanded to 'defined'</h4>
<p>A macro with a 'defined' token in its replacement text, sometimes found in glibc, is found in firefox only once.</p>
<p><samp>modules/oji/src/nsJVMConfigManagerUnix.cpp</samp>
defines a macro as:
<pre>
#define NS_COMPILER_GNUC3 defined(__GXX_ABI_VERSION) &amp;&amp; \
                          (__GXX_ABI_VERSION &gt;= 102) /* G++ V3 ABI */
</pre>
and uses it in itself as:</p>
<pre>
#if (NS_COMPILER_GNUC3)
</pre>
<p>This macro should be removed and the #if line should be rewritten as:</p>
<pre>
#if defined(__GXX_ABI_VERSION) &amp;&amp; (__GXX_ABI_VERSION &gt;= 102) /* G++ V3 ABI */
</pre>
<p>Maybe this file is to be compiled only by GCC; nevertheless, it is not good practice to depend on a preprocessor's wrong implementation. *1</p>
<p>Note:</p>
<p>*1 The GCC-specific-build of <b>mcpp</b> V.2.7 enabled GCC-like handling of 'defined' in a macro on a #if line.
But <b>mcpp</b> warns about it, and you had better revise the code.</p>

<h4>3.9.11.6. Tokens in #endif line</h4>
<p>The following files in the <samp>jpeg</samp> directory have #endif lines with comments that lack comment marks.
All of these lines were introduced by recent updates.</p>
<pre>
jmorecfg.h, jconfig.h, jdapimin.c, jdcolor.c, jdmaster.c
</pre>
<p>Though this style of writing was frequently seen in some sources for UNIX-like systems up until the middle of the 1990s, it has almost completely disappeared nowadays, and cannot be found even in glibc sources.
GCC usually warns about it as expected.
For all that, these sources take such a writing style.
Only Apple-GCC does not warn about it.
Have these sources been edited on Mac OS?</p>

<h4>3.9.11.7. Assembler Source Which Needs Preprocessing</h4>
<p>The assembler sources are written as *.s (*.asm) files, some of which contain macros, but in principle they do not call for a preprocessor.</p>
<p>On Mac OS X / ppc, however, there is one exception.
<samp>xpcom/reflect/xptcall/src/md/unix/xptcinvoke_asm_ppc_rhapsody.s</samp> calls for a preprocessor, because it has a #if block containing only one line.
The block seems to be no longer necessary.</p>

<h4>3.9.11.8. -include option</h4>
<p>Compilation of firefox begins with <samp>configure</samp>, which generates <samp>mozilla-config.h</samp>.
In compilation of most of the sources, this header file is specified by -include option.
<samp>config/gcc_hidden.h</samp> is also specified similarly.
Why don't the sources #include these headers at their top?</p>

<h4>3.9.11.9. Redefinition of macro</h4>
<p>Some silent redefinitions of macros are found, though they are rare.</p>
<ul>
<li><p>In compilation of most of the sources, <samp>-DZLIB_INTERNAL</samp> option is specified.
In other words, the macro is defined as 1.
It is, however, defined by some sources in <samp>modules/zlib/src/</samp> as:</p>
<pre>
#define ZLIB_INTERNAL
</pre>
<p>That is, it is defined as zero tokens.
And it is used as:</p>
<pre>
#    ifdef ZLIB_INTERNAL
</pre>
<p>Though the difference does not change the result in this case, different definitions of the same macro are not recommended.
Maybe the option in the Makefile is redundant.</p>
<li><p><samp>xpcom/build/nsXPCOMPrivate.h</samp> defines a macro <tt>MAXPATHLEN</tt> differently from <samp>/usr/include/sys/param.h</samp>.
This discrepancy stems from an inconsistency among the related header files about whether to include <samp>/usr/include/sys/param.h</samp> or not.
The related header files should be reorganized.</p>
<li><p>On Mac OS X, <tt>assert</tt> macro, once defined in <samp>/usr/include/assert.h</samp>, is redefined in <samp>netwerk/dns/src/nsIDNKitInterface.h</samp>.
'<samp>#undef assert</samp>' should precede it.</p>
<li><p>On Mac OS X, in <samp>modules/libreg/src/VerReg.c</samp>, queer redefinition of macro <tt>VR_FILE_SEP</tt> occurs as:</p>
<pre>
#if defined(XP_MAC) || defined(XP_MACOSX)
#define VR_FILE_SEP ':'
#endif
#ifdef XP_UNIX
#define VR_FILE_SEP '/'
#endif
</pre>
<p>This happens because, on Mac OS X, configure defines both <tt>XP_MACOSX</tt> and <tt>XP_UNIX</tt>.
This redefinition may be an intended one.
Anyway, it is misleading.
It would be better to write it as below, clearly showing the priority of <tt>XP_UNIX</tt>.</p>
<pre>
#ifdef XP_UNIX
#define VR_FILE_SEP '/'
#elif defined(XP_MAC) || defined(XP_MACOSX)
#define VR_FILE_SEP ':'
#endif
</pre>
</ul>
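<p>Two of the points above can be tried out in a small sketch.  All the names are hypothetical stand-ins: <tt>DEMO_FLAG_FROM_OPTION</tt> plays the role of a macro given by a -D option (defined as 1), <tt>DEMO_FLAG_FROM_SOURCE</tt> the role of a macro defined as zero tokens in a source, and <tt>DEMO_PATH_LIMIT</tt> the role of a macro already defined by a system header.  #ifdef and 'defined' treat the two kinds of flag alike, and an explicit #undef makes a redefinition intentional rather than silent.</p>
<pre>
#define DEMO_FLAG_FROM_OPTION   1   /* as if given by a -D option      */
#define DEMO_FLAG_FROM_SOURCE       /* defined as zero tokens          */

#define DEMO_PATH_LIMIT 1024        /* stands in for a header's value  */
#undef  DEMO_PATH_LIMIT             /* cancel it before redefining     */
#define DEMO_PATH_LIMIT 4096

/* Returns 1 if both flags count as defined, whatever their values. */
int demo_both_flags_defined(void)
{
#if     defined(DEMO_FLAG_FROM_OPTION) &amp;&amp; defined(DEMO_FLAG_FROM_SOURCE)
    return 1;
#else
    return 0;
#endif
}

/* Returns the value that is in effect after the explicit redefinition. */
int demo_path_limit(void)
{
    return DEMO_PATH_LIMIT;
}
</pre>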

<h4>3.9.11.10. Too long comments</h4>
<p>The following files have too long comments crossing over several hundred lines or more.</p>
<p><samp>extensions/universalchardet/src/base/Big5Freq.tab, extensions/universalchardet/src/base/EUCKRFreq.tab, intl/unicharutil/src/ignorables_abjadpoints.x-ccmap, layout/generic/punct_marks.ccmap</samp></p>
<p>Especially, in the directories <samp>intl/uconv/ucv*/</samp>, there are many files with too long comments.
There is even a case of a single comment crossing over 8000 lines!
All of these files are named *.uf or *.ut, and are mapping tables between Unicode and various Asian encodings, generated automatically by some tool.
They do not seem to be C/C++ sources, but they are included from other C++ sources.
Most of the content of these files is comments, which seem to be a sort of document or table for some other tool.</p>
<p>It is not advisable to include long documents or tables in source files.
They should be separated from source files, even if placed in source tree.</p>
<p>Though these files are used in Linux, they are not used in Mac OS X.
On the other hand, on Mac OS X, system headers in framework directories are frequently used, and some of them are queer files mostly occupied with comments.</p>

<h4>3.9.11.11. Mixed encodings of newline</h4>
<p>The encoding of newline in firefox source is [LF].
A few files, however, have a small block of lines ending with [CR][LF].
All of these [CR][LF] lines seem to be fragments inserted as patches.
Some conversion tool should be used when one edits a source file on Windows.</p>
<br>

<h2><a name="3.10" href="#toc.3.10">3.10. Visual C++ System Header Problems</a></h2>
<p>I used <b>mcpp</b> to preprocess some sample programs provided by Visual C++ 2003, 2005 and 2008.  The system headers seem to have only a few compatibility problems shown below.  These problems are often seen in other compile systems and do not have a serious impact on preprocessing.</p>
<ol>
<li>Since the days when Visual C++ had hardly implemented any C99 specifications, // comments have often been used in C source code.<br>
<li>Object-like macro definitions that are expanded into function-like macro names are sometimes found.<br>
<li>On Visual C++ 2003, there was one wrong macro definition in limits.h. (It was revised on Visual C++ 2005.  See Note 2 in <a href="cpp-test.html#5.1.3.1">cpp-test.html#5.1.3.1</a>.)<br>
</ol>
<p>Although the Linux system-headers and glibc sources often contain coding based on GCC-local specifications, Visual C++ system headers have only a little Visual C++-local coding.</p>

<h3><a name="3.10.1" href="#toc.3.10.1">3.10.1. Comment Generating Macro?</a></h3>
<p>I found only one outrageous macro in Visual C++.  <samp>Vc7/PlatformSDK/Include/WTypes.h</samp> has the following macro definition: *1</p>
<pre>
#define _VARIANT_BOOL   /##/
</pre>
<p>This macro definition is used in oaidl.h and propidl.h in <samp>Vc7/PlatformSDK/Include/</samp> as follows:</p>
<pre>
_VARIANT_BOOL bool;
</pre>
<p>What does this macro aim at?</p>
<p>This macro seems to expect <tt>_VARIANT_BOOL</tt> to be expanded into // and the line to be commented out.  Actually, this expectation is met in Visual C cl.exe !</p>
<p>In the first place, // is not a token (preprocessing-token).  Macro definitions should be processed and expanded after the source is parsed into tokens and each comment is converted into one space.  Therefore, it is irrational for a macro to generate comments.  When this macro is expanded into //, the result is undefined because // is not a valid preprocessing-token.</p>
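<p>The order of the translation phases can be observed with the # operator.  The following is a minimal sketch with hypothetical names.  Because a comment has already been replaced by one space before macros are processed, the comment inside the macro argument never reaches the # operator, and a conforming preprocessor produces the string "a b".</p>
<pre>
#include &lt;string.h&gt;

#define DEMO_STR(x)     #x

/* The comment in the argument is converted to one space in an
 * earlier translation phase, so it is not part of the string. */
const char *demo_stringized(void)
{
    return DEMO_STR(a /* comment */ b);
}

/* Returns 1 if the argument was stringized as "a b". */
int demo_comment_became_space(void)
{
    return strcmp(demo_stringized(), "a b") == 0;
}
</pre>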
<p>In order to use these header files with <b>mcpp</b>, comment out these macro definitions and change many <tt>_VARIANT_BOOL</tt> occurrences as follows:</p>
<pre>
#if !__STDC__ &amp;&amp; (_MSC_VER &lt;= 1000)
    _VARIANT_BOOL bool;
#endif
</pre>
<p>If you use only Visual C 5.0 or later, this line can be simply commented out as follows:</p>
<pre>
// _VARIANT_BOOL bool;
</pre>
<p>This macro is indeed out of the question.  However, it is Visual C/cl.exe, which allows such an outrageous macro to be preprocessed as a comment, that should be blamed.  This example reveals the following serious problems this preprocessor has:</p>
<ol>
<li>Preprocessing is not token-based but character-based at least in this example.<br>
<li>The macro expansion result is treated as a comment, which indicates that the translation phases are confused.<br>
</ol>
<p>Probably, the cl.exe preprocessor was developed based on a very old somewhat character-based preprocessor.  It is easy to presume that the preprocessor has been upgraded by repeating partial revision to the old preprocessor.</p>
<p>There are many preprocessors which presumably have a very old program structure.  GCC 2/cpp, shown in 3.9, is one such preprocessor.  Repeated partial revision of such a preprocessor will only make its program structure more complicated.  However much such revision may be made, there are limits to the quality such a preprocessor can achieve.  Unless the old source is given up and completely rewritten, a clear and well-structured preprocessor cannot be obtained.</p>
<p>At GCC 3/cpp0, a total revision was made to GCC 2; the entire source code was rewritten.  So, GCC 3/cpp0 has become quite different from GCC 2. Although <b>mcpp</b> was initially developed based on the source of an old preprocessor, DECUS cpp, the source code was totally rewritten soon.</p>
<p>Note:</p>
<p>*1 Visual C++ 2005 Express Edition does not contain Platform SDK.  However, you can download "Platform SDK for Windows 2003", and use it with VC2005.  <samp>wtypes.h, oaidl.h, propidl.h</samp> in this <samp>PlatformSDK/Include</samp> directory also have the same macro definition and its usage as VC2003 Platform SDK.<br>
Also on Visual C++ 2008, in the header files of the same name in '<samp>Microsoft SDKs/Windows/v6.0A/Include</samp>' directory, that macro definition and its usage are quite the same.</p>

<h3><a name="3.10.2" href="#toc.3.10.2">3.10.2. '$' in Identifiers</a></h3>
<p>Another problem is use of '$' in identifiers.
Its use in macro names suddenly increased in the system headers on Visual C++ 2008.
Though such macros were also found on Visual C++ 2005, they were rare.
But, on Visual C++ 2008, they are found here and there.</p>
<p>'<samp>Microsoft Visual Studio 9.0/VC/include/sal.h</samp>' is the most conspicuous one.
This header defines macros for so-called SAL (standard source code annotation language) of Microsoft, and has many names containing '$'.
This file is included from many standard headers via '<samp>Microsoft Visual Studio 9.0/VC/include/crtdefs.h</samp>', so most sources are compiled with these macros without knowing it.</p>
<p>If you specify -Za option to invoke the compiler cl, the SAL is disabled and all of the names with '$' in the <samp>sal.h</samp> disappear.
The necessity of this notation is, however, hard to understand.
Though GCC also enables '$' in identifiers by default, its actual use is rarely found nowadays.</p>
<p>Names of this kind are also found in the system headers named <samp>specstrings*.h</samp> in the '<samp>Microsoft SDKs/Windows/v6.0A/Include</samp>' directory.
They are included from <samp>Windows.h</samp> via <samp>WinDef.h</samp>, and the names with '$' do not disappear even if the -Za option is specified.
The option causes only errors.
So, you cannot use the -Za option to compile a source which includes <samp>Windows.h</samp>.</p>
<br>

<h1><a name="4" href="#toc.4">4. Implementation-defined Behaviors</a></h1>
<p>This chapter does not contain all the C preprocessor specifications.  For details on Standard C preprocessing, refer to cpp-test.html.  For <b>mcpp</b> behaviors in each mode, refer to <a href="#2.1"> 2.1</a>.  This chapter covers several preprocessor-related specifications, including those called implementation-defined by Standards.  For more details on <b>mcpp</b> implementation-defined-behaviors, see Chapter 5, "Diagnostic Messages".</p>
<br>

<h2><a name="4.1" href="#toc.4.1">4.1. Status Value on Exit</a></h2>
<p>The header file internal.H defines values returned by <b>mcpp</b> to a parent process.  <b>mcpp</b> returns 0 on success, where success means that no error has occurred.  On error, it returns errno if errno != 0, and 1 if errno == 0.</p>
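<p>The rule above can be sketched as a small C function.  This is only an illustration: the name <tt>exit_status</tt> and its parameters are hypothetical and do not appear in the <b>mcpp</b> source.</p>

```c
/* Hypothetical helper, not part of mcpp: maps "did an error occur?" and the
   errno value saved at that time to the status value described above. */
int exit_status(int error_occurred, int saved_errno)
{
    if (!error_occurred)
        return 0;                               /* success */
    return saved_errno != 0 ? saved_errno : 1;  /* errno if set, otherwise 1 */
}
```

<p>A parent process can therefore test for status 0 to detect success, but should not assume that a failure always yields 1.</p>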
<br>

<h2><a name="4.2" href="#toc.4.2">4.2. Include Directory Search Path</a></h2>
<p>This section explains the order in which <b>mcpp</b> searches directories for an include file when it encounters a #include directive.</p>
<ol>
<li>If a #include directive argument takes the form of neither "file-name" nor &lt;file-name&gt;, and is a macro, the macro is expanded.  The resulting filename must take the form of either "file-name" or &lt;file-name&gt;.  Otherwise, it causes an error.<br>
<br>
<li>If the resulting filename, either in form of "file-name" or &lt;file-name&gt;, is a full path name, <b>mcpp</b> tries to open it.  If it fails, it causes an error.<br>
<br>
<li>If the resulting filename is not a full path but takes a form of "file-name", <b>mcpp</b> regards it as a filename relative from the current directory or source file directory, and begins searching from that directory.  The former is a directory from which <b>mcpp</b> was invoked and the latter is a directory where the source file that includes the "file-name" resides.  Depending on the specified options and compiler systems, <b>mcpp</b> begins searching directories as follows:<br>
<br>
<ol>
<li>If -I1 is specified, search begins from current directory.
<li>If -I2 is specified, source file directory.
<li>If -I3 is specified, current first and then source file directory.
</ol>
<br>
By default, the compiler-specific-builds for UNIX compiler systems, GCC or Visual C begin searching from the source file directory.
The other compiler-specific-builds begin searching at the current directory.  However, Borland C-specific-build searches current first and then source file directory.
The compiler-independent-build of <b>mcpp</b> begins search from the source file directory.<br>
For GCC, the directories specified by the -iquote option are searched next.
For Visual C, also the directories of each ancestor file of source file are searched one by one.<br>
If <b>mcpp</b> fails to find the desired file, it begins searching as shown in step 4.<br>
<br>
In case of a nested #include, if search begins at current directory, the base directory is always the same.  If search begins at a source file directory, the base directory changes each time a header file resides in other directory.<br>
<br>
<li>If the resulting filename is not a full path name but takes the form of &lt;file-name&gt;, <b>mcpp</b> searches directories in the following order.  If any of these directories is specified as a relative path, <b>mcpp</b> regards it as relative to the current directory at <b>mcpp</b> startup.  If <b>mcpp</b> fails to find or open the desired file after searching all the directories in this order, it causes an error.<br>
<ol>
<br>
<li>Directories specified with the -I &lt;directory&gt; option on <b>mcpp</b> invocation.  If several directories are specified, they are searched in the order specified (from the left).
<li>For GCC-specific-build, directories specified with the -isystem option.  If several directories are specified, they are searched in the order specified (from the left).
<li>Directories specified with an environment variable.  <i>ENV_C_INCLUDE_DIR</i> in noconfig.H (configed.H) defines environment variable names.  In C++, <i>ENV_CPLUS_INCLUDE_DIR</i>, if defined, takes precedence over <i>ENV_C_INCLUDE_DIR</i>.  The GCC-specific-build uses C_INCLUDE_PATH (and also CPLUS_INCLUDE_PATH for C++) as the default environment variable.  Other builds of <b>mcpp</b> use <samp>INCLUDE</samp> (and also <samp>CPLUS_INCLUDE</samp> for C++) as the default.  If an environment variable specifies several directories separated by a delimiter, they are searched in the order specified.  The delimiter is ";" on Windows and ":" on other OSs.
<li>Implementation-specific directories for C++ defined by the <i>CPLUS_INCLUDE_DIR?</i> macros in noconfig.H (configed.H).
<li>Site-specific directories defined by setsysdirs() in system.c (For UNIX systems, <samp>/usr/local/include</samp>).
<li>Implementation-specific directories defined by the <i>C_INCLUDE_DIR?</i> macros in noconfig.H (configed.H).
<li>System-specific directories defined by setsysdirs() in system.c (For UNIX systems, <samp>/usr/include</samp>).
</ol>
</ol>
<p>With the -I- option (-nostdinc option for GCC-specific-build and -X for Visual C-specific-build), the directories described in items 4.4 and later of the list above (that is, sub-item 4 of step 4 onward) are not searched.</p>
<p>ANSI C Rationale says the ANSI committee intends to define a current directory as base directory.  I think this is acceptable, in that the base directory is always constant and that the specification is clearer.  However, some implementations, such as UNIX, seem to define a source file directory as base one at least for #include "header".  The compiler-independent-build of <b>mcpp</b> also takes source file directory as base, according to the majority.</p>
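<p>The choice of base directory for #include "file-name" in step 3 can be sketched as follows.  This is a minimal model under stated assumptions: the names <tt>search_rule</tt> and <tt>base_dirs</tt> are hypothetical, and the GCC -iquote directories and the Visual C ancestor directories are left out.</p>

```c
#include <stddef.h>

/* Hypothetical names; the numeric values mirror the -I1, -I2 and -I3 options. */
enum search_rule { SEARCH_CURRENT = 1, SEARCH_SOURCE = 2, SEARCH_BOTH = 3 };

/* Fills out[] with the base directories tried first for #include "file-name",
   in search order, and returns how many entries were written (1 or 2). */
size_t base_dirs(enum search_rule rule, const char *cur_dir,
                 const char *src_dir, const char *out[2])
{
    size_t n = 0;

    if (rule == SEARCH_CURRENT || rule == SEARCH_BOTH)
        out[n++] = cur_dir;     /* directory from which mcpp was invoked */
    if (rule == SEARCH_SOURCE || rule == SEARCH_BOTH)
        out[n++] = src_dir;     /* directory of the including source file */
    return n;
}
```

<p>In a nested #include, <tt>cur_dir</tt> stays the same for every file, while <tt>src_dir</tt> has to be recomputed for each including file.</p>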
<br>

<h2><a name="4.3" href="#toc.4.3">4.3. How to Construct Header Name</a></h2>
<p>This section explains how to construct a header-name pp-token and extract a file name from it.</p>
<ol>
<li>If source code contains a header file name in the string literal format, <b>mcpp</b> regards it as a header-name and removes the " at both ends to construct a filename.  This also applies to a string literal resulting from macro expansion in source code.<br>
<br>
<li>If source code contains a header file name in the &lt;filename&gt; format, <b>mcpp</b> regards it as a header-name and removes the &lt; and &gt; at both ends to construct a filename.  This also applies to a &lt;filename&gt; format sequence resulting from macro expansion.  Spaces in the macro expansion are retained, with multiple spaces squeezed into one.<br>
<br>
<li>In any case, <b>mcpp</b> converts \ to /, although both of "\" and "/" can be used as path delimiters on Windows.<br>
<br>
</ol>
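<p>The three steps above can be sketched in C as follows.  The function name <tt>header_to_filename</tt> is hypothetical and this is not the routine <b>mcpp</b> actually uses; the sketch also assumes that macro expansion and space squeezing have already been done.</p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: strips the "..." or <...> delimiters from a header-name
   and converts every \ to /.  Returns 0 on success, or -1 if hn is not a
   header-name or the result does not fit in out[outsize]. */
int header_to_filename(const char *hn, char *out, size_t outsize)
{
    size_t len = strlen(hn);
    size_t i;
    char closing;

    if (len < 3)
        return -1;                  /* two delimiters plus at least one character */
    if (hn[0] == '"')
        closing = '"';
    else if (hn[0] == '<')
        closing = '>';
    else
        return -1;
    if (hn[len - 1] != closing || len - 2 >= outsize)
        return -1;
    for (i = 0; i < len - 2; i++) {
        char c = hn[i + 1];
        out[i] = (c == '\\') ? '/' : c;     /* unify the path delimiter */
    }
    out[len - 2] = '\0';
    return 0;
}
```

<p>With this sketch, the header-name &lt;sys\types.h&gt; yields the filename sys/types.h.</p>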
<br>

<h2><a name="4.4" href="#toc.4.4">4.4. Evaluation of #if Expression</a></h2>
<p>Evaluation of #if expression depends on the largest integer type of the host compiler (by which <b>mcpp</b> was compiled) and that of the target compiler (which uses <b>mcpp</b>).  Since the compiler-independent-build has no target compiler, the type depends only on the host compiler.</p>
<p><b>mcpp</b> in Standard mode evaluates #if expression in the common largest integer type of the host and target compiler.  Nevertheless, <b>mcpp</b> in pre-Standard mode evaluates it in (signed) long.</p>
<p>In the compiler-systems having type "long long", if <tt>__STDC_VERSION__</tt> is set to 199901L or higher using the -V199901L option, <b>mcpp</b> evaluates a #if expression in "long long" or "unsigned long long", according to the C99 specification.  Although C90 and C++98 stipulate that a #if expression is evaluated in long / unsigned long, <b>mcpp</b> evaluates it in long long / unsigned long long even in C90 or C++98 mode, and issues a warning if the value overflows the range of long / unsigned long. *1</p>
<p>Visual C and Borland C 5.5 do not have a "long long" type, but have an __int64 type of the same length.  So, a #if expression is evaluated as __int64 / unsigned __int64.  (However, since LL and ULL suffixes cannot be used in Visual C++ 2002 or earlier and Borland C 5.5, these suffixes must not be used in coding other than #if lines.)</p>
<p>In addition, when you invoke with the -+ option for C++ preprocessing, <b>mcpp</b> evaluates pp-tokens 'true' and 'false' in a #if expression to 1LL (or 1L) and 0LL (or 0L), respectively.</p>
<p><b>mcpp</b> in Standard mode evaluates #if expression as follows.  For a compiler without long long, please read "long long" and "unsigned long long" hereinafter, until the end of 4.5, as "long" and "unsigned long", respectively.  For pre-Standard mode read all of them as "long".</p>
<ol>
<li>An integer constant token with a U suffix, including character constants, is evaluated in unsigned long long. (Note that pre-Standard mode does not recognize the U suffix).<br>
<li>Otherwise, a token within the range of non-negative long long is evaluated in long long.<br>
<li>Otherwise, a token within the range of unsigned long long is evaluated in unsigned long long.<br>
<li>Otherwise, it is diagnosed as an out of range error.<br>
<li>In a binary operation, if either operand is unsigned long long, both are converted to unsigned long long.  Otherwise, an operation is performed in signed long long.<br>
</ol>
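<p>The typing rules listed above can be modeled in C as follows.  This is only a sketch: the names are hypothetical, it handles integer tokens but not character constants, and it assumes that the host's unsigned long long is the evaluation type, relying on the standard strtoull() to detect overflow.</p>

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names, not taken from the mcpp source. */
enum pp_int_type { PP_LONG_LONG, PP_UNSIGNED_LONG_LONG, PP_OUT_OF_RANGE };

/* Classifies one integer constant token such as "123", "0x7fU" or "1LL". */
enum pp_int_type classify_pp_integer(const char *token)
{
    char *suffix;
    unsigned long long value;

    errno = 0;
    value = strtoull(token, &suffix, 0);    /* base 0: decimal, octal or hex */
    if (errno == ERANGE)
        return PP_OUT_OF_RANGE;             /* rule 4 */
    if (strpbrk(suffix, "uU") != NULL)
        return PP_UNSIGNED_LONG_LONG;       /* rule 1: U suffix */
    if (value <= (unsigned long long)LLONG_MAX)
        return PP_LONG_LONG;                /* rule 2 */
    return PP_UNSIGNED_LONG_LONG;           /* rule 3 */
}

/* Rule 5: the type in which a binary operation is performed. */
enum pp_int_type binary_operation_type(enum pp_int_type lhs, enum pp_int_type rhs)
{
    return (lhs == PP_UNSIGNED_LONG_LONG || rhs == PP_UNSIGNED_LONG_LONG)
        ? PP_UNSIGNED_LONG_LONG : PP_LONG_LONG;
}
```

<p>One consequence of rule 5 is that in an expression such as -1 &lt; 0U the left operand is converted to unsigned long long, so the comparison is false.</p>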
<p>Anyway, an integer constant token always has a non-negative value.<br>
In pre-Standard mode, an integer constant token is evaluated within the range of non-negative long.  A token beyond that range is diagnosed as an out of range error.  All the operations are performed within the range of long.</p>
<p>If both the host and target compilers have type unsigned long long and the range of unsigned long long of the host is narrower than that of the target, a value beyond the host's range is diagnosed as an out of range error.</p>
<p>If an operation using constant tokens produces a result out of range of long long, an out of range error occurs.  If it produces a result out of range of unsigned long long, a warning is issued.  This can be applied to intermediate operation results.</p>
<p>Since a bitwise right shift of a negative value or a division operation using it does not provide portability, <b>mcpp</b> issues a warning.  If an operation using a mixture of unsigned and signed operands converts a signed negative value to an unsigned positive value, a warning is also issued.  How these values are evaluated depends on the specification of the compiler-proper of the host system.</p>
<p>C90 and C++98 make it a rule that a preprocessor evaluates a #if expression in long/unsigned long (in C99, the maximum integer type is used).  These specifications are rougher than those of compiler-propers.  A (#)if expression is often evaluated differently between preprocessor and compiler-proper, especially when sign extension is involved.</p>
<p>In addition, since keywords are not used during Standard C preprocessing, sizeof or cast cannot be used in a #if expression.  Of course, neither variables, enumeration constants, nor floating point numbers can be used there.  Standard mode allows the "defined" operator in a #if expression as well as the #elif directive.  Except for these differences, <b>mcpp</b> evaluates a #if expression in accordance with the precedence and associativity of operators, just as compiler-propers do.  In a binary operation, an arithmetic conversion often takes place to equalize the types of both operands; if one operand is unsigned long long and the other is long long, both are converted to unsigned long long.</p>
<p>Note:</p>
<p>*1 <b>mcpp</b> up to V.2.5 evaluated #if expression in C90 and C++98 by long long / unsigned long long internally, and issued an error on overflow of long / unsigned long.  From V.2.6 onward, <b>mcpp</b> degraded the error to warning for compatibility with GCC or Visual C.</p>
<br>

<h2><a name="4.5" href="#toc.4.5">4.5. Character Constant Evaluation in #if Expression</a></h2>
<p>Constant tokens in a #if expression include identifiers (macros and non-macros), integer tokens and character constants.  How character constants are evaluated is implementation-defined and lacks portability.  Even (#)if 'const' is sometimes evaluated differently between preprocessor and compiler-proper.  Note that the Standards do not even guarantee that (#)if 'const' is evaluated to the same value by both.</p>
<p><b>mcpp</b> in <i>POSTSTD</i> mode does not evaluate a character constant in a #if expression, which is almost meaningless, and makes it an error.</p>
<p>Like other integer constant tokens, <b>mcpp</b> evaluates a character constant in a #if expression within the range of long long or unsigned long long. (In pre-Standard mode, long only.)</p>
<p>A multi-byte character or a wide character is generally evaluated with a 2-byte type.  The exception is the UTF-8 encoding: since it has a variable length, <b>mcpp</b> evaluates it with a 4-byte type.  <b>mcpp</b> does not support EUC's 3-byte encoding scheme. (A 3-byte character is recognized as 1 byte + 2 bytes.  As a consequence, its value is evaluated correctly.)  Although there are some implementations using the 2-byte encoding scheme that define wchar_t as 4-byte, <b>mcpp</b>'s evaluation has no relation to wchar_t.  The following paragraphs describe two-byte multi-byte character encodings.</p>
<p>Multi-byte character constants, such as 'X', are evaluated to ((First byte value &lt;&lt; 8) + Second byte value).  (8 is the value of <tt>CHAR_BIT</tt> in &lt;limits.h&gt;.)  Note that 'X' is used here to designate a multi-byte character.  Though 'X' itself is not a multi-byte character, it is used here to avoid character garbling.</p>
<p>Let me take an example of multi-character character constants, such as 'ab', '\x12\x3', and '\x123\x45'.  'a', 'b', '\x12', '\x3' and '\x123' are regarded as one byte.  When a multi-character character constant is evaluated, each one byte, starting from the highest one, is evaluated within the range of [0, 0xFF] and combined by shifting it to left by 8.  (0xFF is the value of <tt>UCHAR_MAX</tt> in &lt;limits.h&gt;.)  If the value of one escape sequence exceeds 0xFF, an out of range error occurs.  Therefore, in the implementation of the ASCII character set, the above three tokens are evaluated to 0x6162, 0x1203 and error, respectively.</p>
<p>L'X' is evaluated to the same value as 'X'.  Let me take an example of multi-character wide character constants, such as L'ab', L'\x12\x3', and L'\x123\x45'.  L'a', L'b', L'\x12', L'\x3', L'\x123', and L'\x45' are regarded as one wide character.  When a multi-character wide character constant is evaluated, each wide character, starting from the highest one, is evaluated within the range of [0, 0xFFFF] and combined by shifting it to left by 16.  If the value of one escape sequence exceeds the maximum value of an unsigned 2-byte integer, an out of range error occurs.  Therefore, in the implementation of the ASCII character set, the above three tokens are evaluated to 0x00610062, 0x00120003, and 0x01230045, respectively.</p>
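<p>The combining rule for multi-character constants described above can be sketched in C as follows.  The function <tt>eval_multichar</tt> is a hypothetical model, not <b>mcpp</b>'s code; it takes the already decoded value of each character or escape sequence, and uses 8 bits per unit for ordinary constants and 16 bits per unit for wide ones.</p>

```c
#include <stddef.h>

/* Hypothetical model of the rule above.
   units[]   : value of each character or escape sequence, highest first
   unit_bits : 8 for constants like 'ab', 16 for constants like L'ab'
   Returns 0 and stores the result in *value, or -1 on an out of range error. */
int eval_multichar(const unsigned long *units, size_t n, unsigned unit_bits,
                   unsigned long long *value)
{
    unsigned long long max_unit = (1ULL << unit_bits) - 1;  /* 0xFF or 0xFFFF */
    unsigned long long v = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (units[i] > max_unit)
            return -1;          /* e.g. '\x123' does not fit in one byte */
        if (v > (~0ULL >> unit_bits))
            return -1;          /* result would exceed unsigned long long */
        v = (v << unit_bits) + units[i];
    }
    *value = v;
    return 0;
}
```

<p>Passing the units 0x61 and 0x62 with 8 bits per unit gives 0x6162, and with 16 bits per unit gives 0x00610062, matching the ASCII examples in the text.</p>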
<p>If the values of a multi-character character constant and a multi-character wide character constant exceed the range of unsigned long long, an out of range error occurs.</p>
<p>With <tt>__STDC_VERSION__</tt> or <tt>__cplusplus</tt> set to 199901L or higher, <b>mcpp</b> evaluates a Universal Character Name (UCN) in the form of \uxxxx and \Uxxxxxxxx as a hex escape sequence. (I know this evaluation is nonsense, but there is no other way.)</p>
<p>If the compiler-proper of the target compiler system uses a signed char or signed wchar_t, a character constant in a (#)if expression may be evaluated differently between <b>mcpp</b> and compiler-proper.  The range that causes a range error may also differ between them.  In addition, evaluation of multi-character character constants and multi-byte character constants varies even among preprocessors and among compilers.  Standard C does not define whether, with <tt>CHAR_BIT</tt> set to 8, 'ab' is evaluated to 'a' * 256 +'b' or 'a' + 'b' * 256.</p>
<p>In general, character constants should not be used in an #if expression, as long as you have an alternative method.  I think an alternative method always exists.</p>
<br>

<h2><a name="4.6" href="#toc.4.6">4.6. #if sizeof (type)</a></h2>
<p>Standard C stipulates that preprocessing is a process independent of run-time environments or compiler-proper specifications, thus prohibiting it from using sizeof and cast in an #if expression.  However, pre-Standard mode allows sizeof (type) in a #if expression.  This was done as a part of my effort to add necessary modifications to DECUS cpp, such as adding long long and long double processing, while retaining its original functionality.  As to cast, I neither implemented it nor intended to do so, because it would require troublesome work.</p>
<p>A series of macros beginning with S_, such as <i>S_CHAR</i>, in eval.c define the size of each type.  Under cross implementation, these macros must be modified to specify size of the types, in integer values, used in the target system.</p>
<p>I have to admit that <b>mcpp</b> does not provide the full functionality of #if sizeof.  <b>mcpp</b> simply ignores the word "signed" or "unsigned" preceding char, short, int, long, and long long when it appears in a #if sizeof.  Also, <b>mcpp</b> does not support sizeof (void *).  I know this is a half-hearted implementation, but I do not want to increase the number of flags in system.H in vain for this non-conforming function.  I initially thought of removing the sizeof code from the original version because I did not intend to support cast at all, but on second thought, I decided to make a few modifications to make use of the existing code.</p>
<br>

<h2><a name="4.7" href="#toc.4.7">4.7. How to Handle White-Space Sequence</a></h2>
<p><b>mcpp</b> in principle compresses a white-space sequence, excluding &lt;newline&gt;, as a token separator into one space character during tokenization in the translation phase 3. If -k or -K option is specified in <i>STD</i> mode, however, it outputs horizontal white spaces as they are without compressing.  It also deletes a white-space sequence at the end of a line.</p>
<p>A white-space sequence at the beginning of a line is deleted in <i>POSTSTD</i> mode, and output as it is in other modes.  The latter is special treatment for the convenience of human readers. *1</p>
<p>This compression and deletion occurs during the intermediate phase.  The next phase 4 involves macro expansion and preprocess-directive-line processing.  Macro expansion may sometimes produce several space characters before and after the macro.  Of course, the number of space characters does not affect compilation results.</p>
<p>Standard C says that whether implementation compresses a white-space sequence into one space character during the translation phase 3 is implementation-defined, but you usually do not have to worry about this.  &lt;Vertical-tab&gt; or &lt;form-feed&gt; in a preprocessor directive line may adversely affect portability, since this is undefined in Standard C.  <b>mcpp</b> converts it to one space character.</p>
<p>Note:</p>
<p>*1 Up to V.2.6.3 <b>mcpp</b> squeezed line top white spaces into one space.
In V.2.6.4, it changed the behavior.</p>
<br>

<h2><a name="4.8" href="#toc.4.8">4.8. Default Specifications for <b>mcpp</b> Executables</a></h2>
<p>This section describes the specifications of <b>mcpp</b> executables generated when the DIFfile and makefile for each compiler system in the noconfig directory are used to compile <b>mcpp</b> with default settings.  When a configure script is used to compile <b>mcpp</b>, the generated <b>mcpp</b> may differ depending on configure's results.  However, as long as the OS and compiler system versions are the same, the generated <b>mcpp</b>s should be the same except for include directories.</p>
<p>The compiler-independent-build of <b>mcpp</b> has the constant specifications regardless of the compiler system with which <b>mcpp</b> was compiled, except a few features dependent on OS and CPU.</p>
<p>There are compiler-independent-build and compiler-specific-build for <b>mcpp</b> executables, and each executable has several behavioral modes.  For those, refer to <a href="#2.1"> 2.1</a>.  This section describes the settings centering on <i>STD</i> mode.</p>
<p>DIFfiles and makefiles are for the following compiler systems:</p>
<blockquote>
<table>
  <tr><th>FreeBSD 6.3        </th><td>GCC V.3.4</td></tr>
  <tr><th>Vine Linux 4.2 / x86          </th><td>GCC V.2.95, V.3.2, V.3.3, V.3.4, V.4.1</td></tr>
  <tr><th>Debian GNU/Linux 4.0 / x86    </th><td>GCC V.4.1</td></tr>
  <tr><th>Ubuntu Linux 8.04 / x86_64    </th><td>GCC V.4.2</td></tr>
  <tr><th>Fedora Linux 9 / x86          </th><td>GCC V.4.3</td></tr>
  <tr><th>Mac OS X Leopard / x86        </th><td>GCC V.4.0</td></tr>
  <tr><th>CygWIN             </th><td>1.3.10 (GCC V.2.95), 1.5.18 (GCC 3.4)</td></tr>
  <tr><th>MinGW &amp; MSYS   </th><td>GCC 3.4</td></tr>
  <tr><th>WIN32              </th><td>LCC-Win32 2003-08, 2006-03</td></tr>
  <tr><th>WIN32              </th><td>Visual C++ 2003, 2005, 2008</td></tr>
  <tr><th>WIN32              </th><td>Borland C++ V.5.5</td></tr>
</table>
</blockquote>
<p>In addition, for the following compilers which I don't have, the difference files contributed from some users are contained here.</p>
<blockquote>
<table>
  <tr><th>WIN32 </th><td>Visual C++ V.6.0, 2002</td></tr>
  <tr><th>WIN32 </th><td>Borland C++ V.5.9 (C++Builder 2007)</td></tr>
</table>
</blockquote>
<p>Of all the macros defined in noconfig.H and system.H, the settings of those mentioned below are identical among every <b>mcpp</b> executable, regardless of their compiler systems.</p>
<p>Each <b>mcpp</b> is compiled with <i>DIGRAPHS_INIT</i> == <i>FALSE</i>, so enables digraph when the -2 (-digraphs) option is specified.<br>
With <i>TRIGRAPHS_INIT</i> == <i>FALSE</i>, trigraph is enabled with the -3 (-trigraphs) option.<br>
With <i>OK_UCN</i> set to <i>TRUE</i>, Universal Character Name (UCN) can be used in C99 and C++.<br>
With <i>OK_MBIDENT</i> set to <i>FALSE</i>, multi-byte-characters cannot be used in identifiers.</p>
<p>With <i>STDC</i> set to 1, the initial value of <tt>__STDC__</tt> is 1.</p>
<p>The translation limits are set as follows.</p>
<blockquote>
<table>
  <tr><th><i>NMACPARS</i> (Maximum number of macro arguments)                   </th><td>255</td></tr>
  <tr><th><i>NEXP</i> (Maximum number of nested levels of #if expressions)      </th><td>256</td></tr>
  <tr><th><i>BLK_NEST</i> (Maximum number of nested levels of #if section)      </th><td>256</td></tr>
  <tr><th><i>RESCAN_LIMIT</i> (Maximum number of nested levels of macro rescans)</th><td>64</td></tr>
  <tr><th><i>IDMAX</i> (Valid length of identifier)                             </th><td>1024</td></tr>
  <tr><th><i>INCLUDE_NEST</i> (Maximum number of #include nest level)           </th><td>256</td></tr>
  <tr><th><i>NBUFF</i> (Maximum length of a source line) *1                     </th><td>65536</td></tr>
  <tr><th><i>NWORK</i> (Maximum length of an output line)                       </th><td>65536</td></tr>
  <tr><th><i>NMACWORK</i> (Size of internal buffers used for macro expansion)   </th><td>262144</td></tr>
</table>
</blockquote>
<p>On GCC-specific-build and Visual C-specific-build, however, <i>NMACWORK</i> is used as the maximum length of an output line.</p>
<p>The following macro differs by OS, regardless of build type.</p>
<p><i>MBCHAR</i> (Default encoding of multibyte character):</p>
<blockquote>
<table>
  <tr><th>Linux, FreeBSD, Mac OS X  </th><td>EUC-JP</td></tr>
  <tr><th>WIN32, CygWIN, MinGW      </th><td>SJIS</td></tr>
</table>
</blockquote>
<p>The settings of the macros below are different among compiler systems.</p>
<p><i>STDC_VERSION</i> (Initial value of <tt>__STDC_VERSION__</tt>):</p>
<blockquote>
<table>
  <tr><th>Compiler-independent, GCC 2</th><td>199409L</td></tr>
  <tr><th>Others            </th><td>0L</td></tr>
</table>
</blockquote>
<p><i>HAVE_DIGRAPHS</i>   (Are digraphs output as they are?):</p>
<blockquote>
<table>
  <tr><th>Compiler-independent, GCC, Visual C</th><td><i>TRUE</i></td></tr>
  <tr><th>Others                    </th><td><i>FALSE</i></td></tr>
</table>
</blockquote>
<p><i>EXPAND_PRAGMA</i>   (Is a #pragma line macro-expanded in C99?):</p>
<blockquote>
<table>
  <tr><th>Visual C, Borland C</th><td><i>TRUE</i></td></tr>
  <tr><th>Others  </th><td><i>FALSE</i></td></tr>
</table>
</blockquote>
<p>GCC 2.7-2.95 defines <tt>__STDC_VERSION__</tt> to 199409L.  However, in GCC V.3.*,V.4.*, <tt>__STDC_VERSION__</tt> is no longer predefined by default and is now defined in accordance with an execution option.  <b>mcpp</b> setting for GCC follows these variations.</p>
<p>If <i>STDC_VERSION</i> is set to 0L, <b>mcpp</b> predefines <tt>__STDC_VERSION__</tt> as 0L.  So, specifying the -V199409L option sets <tt>__STDC__</tt> and <tt>__STDC_VERSION__</tt> to 1 and 199409L, respectively and allows only predefined macros that begin with '_', resulting in <b>mcpp</b> in the strictly C95 conforming mode.  The -V199901L option specifies C99 mode.</p>
<p>In C99 mode, <b>mcpp</b> predefines <tt>__STDC_HOSTED__</tt> as 1.</p>
<p><b>mcpp</b> itself predefines neither <tt>__STDC_ISO_10646__</tt>, <tt>__STDC_IEC_559__</tt> nor <tt>__STDC_IEC_559_COMPLEX__</tt>.  These values are compiler-system-specific.  In glibc 2 / x86, the system header defines <tt>__STDC_IEC_559__</tt> and <tt>__STDC_IEC_559_COMPLEX__</tt> as 1.  Other compiler systems do not define them.</p>
<p>If <i>HAVE_DIGRAPHS</i> is set to <i>FALSE</i>, digraph is output after converting to usual token.</p>
<p>The argument of #pragma line beginning with STDC, MCPP or GCC is never macro-expanded even if <i>EXPAND_PRAGMA</i> == <i>TRUE</i>.</p>
<p>Include directories are set as follows:</p>
<p>System-specific or site-specific directories under UNIX-like OSs are as follows (common to compiler-independent-build and compiler-specific-build):</p>
<blockquote>
<table>
  <tr><th>FreeBSD, Linux, Mac OS X, CygWIN</th><td>/usr/include,  /usr/local/include</td></tr>
</table>
</blockquote>
<p>Mac OS X has also the framework directories set to /System/Library/Frameworks and /Library/Frameworks by default.</p>
<p>On MinGW, /mingw/include is the default include directory.</p>
<p>CygWIN GCC-specific-build changes /usr/include to /usr/include/mingw by -mno-cygwin option.</p>
<p>For the implementation-specific directories that vary among compiler systems and their versions, see the DIFfiles.  The compiler-independent-build does not set implementation-specific directories.  <b>mcpp</b> for the compiler systems on Windows does not preset any directory but uses the environment variables: INCLUDE, CPLUS_INCLUDE.  These environment variables are used by the compiler-independent-build too.</p>
<p>If these default settings do not suit you, change settings to recompile <b>mcpp</b>, or use environment variables or the -I option.</p>
<p>When the length of a preprocessed line exceeds <i>NWORK</i>-1, <b>mcpp</b> generally divides it into several lines so that each line length becomes equal to or less than <i>NWORK</i>-1.  A string literal length must be equal to or less than <i>NWORK</i>-2.
<b>mcpp</b> of GCC-specific-build and Visual C-specific-build, however, does not divide output lines.</p>
<p>Again for confirmation, the macros mentioned above in <i>italics</i> are used only to compile <b>mcpp</b>, and are not built-in macros in a <b>mcpp</b> executable.</p>
<p>If you invoke <b>mcpp</b> without an input file and enter '#pragma MCPP put_defines', the built-in macros will be displayed.</p>
<p>With <tt>__STDC__</tt> set to 1 or higher, the macros that do not begin with '_' are deleted.  The -N (-undef) option deletes all the macros other than <tt>__MCPP</tt>.  After -N, you can use -D to define macro symbols over again.  When you use a different compiler system version from those specified here, -N and -D allow you to redefine your version macro without recompiling <b>mcpp</b>.  The -D option allows you to redefine a particular macro without using -N or -U.</p>
<p>When you use the -+ (-lang-c++) option to specify C++ preprocessing, <tt>__cplusplus</tt> is predefined with its initial value of 1L.  In addition, some other macros are also predefined:</p>
<p>GCC has some predefined macros, but until GCC V.3.2 those predefined by GCC itself were few; most of them are passed from gcc to cpp by the -D option.  So, <b>mcpp</b> does not need to define them for compatibility.  However, <b>mcpp</b> predefines these macros so that it can be used stand-alone, for example for pre-preprocessing.</p>
<p>GCC V.3.3 and later suddenly predefine 60 to 70 macros.  The GCC-specific-build of <b>mcpp</b> V.2.5 and later for GCC V.3.3 or later also includes these predefined macros in addition to the above ones.  These GCC-specific predefined macros are written in the <samp>mcpp_g*.h</samp> header files, which are generated when <b>mcpp</b> is installed.</p>
<p>Since FreeBSD, Linux, CygWIN, MinGW / GCC and LCC-Win32, Visual C 2008 have a type long long, an #if expression is evaluated in long long or unsigned long long.  Visual C 6.0, 2002, 2003, 2005 and Borland C 5.5 do not have a "long long" type but __int64 and unsigned __int64 instead.  These types are used.</p>
<p>In the above compiler systems, type long has the range:</p>
<pre>
[-2147483647-1, 2147483647] ([-0x7fffffff-1, 0x7fffffff])
</pre>
<p>and unsigned long has the range:</p>
<pre>
[0, 4294967295] ([0, 0xffffffff]).
</pre>
<p>In the compiler systems that have type long long, it has the range:</p>
<pre>
[-9223372036854775807-1, 9223372036854775807]
([-0x7fffffffffffffff-1, 0x7fffffffffffffff]),
</pre>
<p>and unsigned long long has the range:</p>
<pre>
[0, 18446744073709551615] ([0, 0xffffffffffffffff]).
</pre>
<p>All the compiler-propers of the above compiler systems internally represent a signed integer in two's complement, and bit operations work on that representation.  The same applies to <b>mcpp</b>'s #if expressions.</p>
<p>Right shift of a negative integer is an arithmetic shift.  The same applies to <b>mcpp</b>'s #if expressions. (Right shifting an integer by one bit halves the value with the sign retained.)</p>
<p>In an integer division or modulus operation, if either or both operands are negative, an algebraic operation like that of Standard C's ldiv() function is performed.  The same applies to <b>mcpp</b>'s #if expressions.</p>
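<p>The three properties above can be checked on the compiler at hand.  The following is a minimal sketch, not part of <b>mcpp</b>: the function names are made up for illustration, and it assumes a preprocessor on one of the two's complement systems described above.  ldiv() is included as the run-time counterpart of the #if division.</p>

```c
#include <stdlib.h>

/* 1 if ~0 equals -1 in #if, which holds for two's complement. */
int if_is_twos_complement(void)
{
#if ~0 == -1
    return 1;
#else
    return 0;
#endif
}

/* 1 if #if shifts a negative value arithmetically
 * (implementation-defined in Standard C). */
int if_right_shift_is_arithmetic(void)
{
#if (-8 >> 1) == -4
    return 1;
#else
    return 0;
#endif
}

/* 1 if #if division and modulus truncate toward zero, as ldiv() does. */
int if_division_is_algebraic(void)
{
#if (-7 / 2) == -3 && (-7 % 2) == -1
    return 1;
#else
    return 0;
#endif
}

/* Run-time counterpart: ldiv(-7, 2) yields quot == -3 and rem == -1. */
int ldiv_matches(void)
{
    ldiv_t r = ldiv(-7L, 2L);
    return r.quot == -3 && r.rem == -1;
}
```

<p>If any of the #if based functions returns 0 on some system, the statements above do not hold for that system's preprocessor.</p>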
<p>These OSs use the ASCII basic character set.  So does <b>mcpp</b>.</p>
<p>There is a memory management routine, kmmalloc, that I developed.  This routine has malloc(), free(), realloc() and other memory handling functions.  If kmmalloc is installed in systems other than CygWIN or Visual C 2005 or 2008, kmmalloc is linked when the MALLOC=KMMALLOC (or -DKMMALLOC=1) option is specified in make.  Its heap memory debugging routine is linked as well.  <b>mcpp</b> for Linux and LCC-Win32 uses <i>EFREEP</i>, <i>EFREEBLK</i>, <i>EALLOCBLK</i>, <i>EFREEWRT</i> and <i>ETRAILWRT</i> with errno values of 2120, 2121, 2122, 2123 and 2124 assigned, and the other builds of <b>mcpp</b> use 120, 121, 122, 123 and 124. (Refer to <a href="mcpp-porting.html#4.extra">mcpp-porting.html#4.extra</a>.)  *2</p>
<p>On systems other than GNU and Visual C, you should preset the environment variable TZ, for example to JST-9 in Japan.  Otherwise, the <tt>__DATE__</tt> and <tt>__TIME__</tt> macros are not set correctly.</p>
<p>Note:</p>
<p>*1 This limit also applies to the line spliced by &lt;backslash&gt;&lt;newline&gt; deletion.  Moreover, it applies to the line after a comment is converted into a space, including the case where several logical lines are concatenated by a comment spreading across them.</p>
<p>*2 CygWIN 1.3.10 and 1.5.18 provide a malloc() that has an internal routine named _malloc_r(), which is called by a few other library functions.  So this malloc() cannot be replaced with another malloc().  Also in Visual C 2005 and 2008, the program terminating routine calls an internal routine of the resident malloc(), hence another malloc() cannot be used.</p>
<br>

<h1><a name="5" href="#toc.5">5. Diagnostic Messages</a></h1>

<h2><a name="5.1" href="#toc.5.1">5.1. Diagnostic Messages Format</a></h2>
<p>This section covers diagnostic messages issued by <b>mcpp</b>, as well as their meaning.  By default, these messages are output to stderr.  With the -Q option, they are redirected to the mcpp.err file in the current directory.  A diagnostic message is output in the following manner:</p>
<ol>
<li>"filename: line: " is followed by "fatal error: ", "error: " or "warning: " and then by any of the diagnostic messages shown in sections 5.3 to 5.9.  Although the requirement that a diagnostic message fit in one line beginning with "filename: line:" seems to lack flexibility, I followed it because it is the traditional way of implementing messages in C on UNIX and because various tools already assume it.  Some <b>mcpp</b> messages do not fit in one line of a typical terminal.<br>
<br>
<li>If an error occurs during macro expansion, the macro invocation is displayed.  For nested macro invocations, <b>mcpp</b> shows each macro name and its definition, as well as the source filename and line number where the macro is defined.<br>
<br>
<li>The source file name, the line number and the line at which an error has occurred are displayed.  If an error has occurred in an included file, the names, line numbers and the #include lines of all the including files are displayed.  Usually, a logical line with comments replaced with a space character is displayed.  The logical line is constructed from one or more physical lines with '\' at the line end.  If a comment spreads over several lines, several logical lines are concatenated into one, which is displayed as the line.  In this case, the line number of the last concatenated physical line is displayed.  Note that if an error occurs during the translation phase before processing a comment, the line in the phase is displayed.<br>
</ol>
<p>If the -j option is specified, <b>mcpp</b> outputs neither item 2 nor item 3 above.</p>
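<p>Because the first line of every diagnostic has a fixed shape, a tool can split it mechanically.  The following is a minimal sketch of such a tool-side parser, not code from <b>mcpp</b>: the names parse_diag and diag_level are hypothetical, the shape "filename: line: level: message" is assumed from item 1 above, and file names that themselves contain a colon are not handled.</p>

```c
#include <stdlib.h>
#include <string.h>

enum diag_level { DIAG_UNKNOWN, DIAG_WARNING, DIAG_ERROR, DIAG_FATAL };

/*
 * Splits the head of one diagnostic line of the assumed shape
 *     filename: line: level: message
 * Returns the level; on success the line number is stored in *lineno.
 */
enum diag_level parse_diag(const char *text, long *lineno)
{
    static const struct { const char *tag; enum diag_level level; } tags[] = {
        { "fatal error:", DIAG_FATAL },
        { "error:",       DIAG_ERROR },
        { "warning:",     DIAG_WARNING },
    };
    const char *p = strchr(text, ':');      /* end of the file name */
    char *end;
    long number;
    size_t i;

    if (p == NULL)
        return DIAG_UNKNOWN;
    number = strtol(p + 1, &end, 10);       /* strtol skips leading blanks */
    if (end == p + 1 || *end != ':')
        return DIAG_UNKNOWN;                /* no digits, or no ':' after them */
    p = end + 1;
    while (*p == ' ')
        p++;
    for (i = 0; i < sizeof tags / sizeof tags[0]; i++) {
        if (strncmp(p, tags[i].tag, strlen(tags[i].tag)) == 0) {
            *lineno = number;
            return tags[i].level;
        }
    }
    return DIAG_UNKNOWN;
}
```

<p>The items 2 and 3 that follow the first line have no such prefix, so a parser of this kind would simply treat lines returning DIAG_UNKNOWN as continuation of the previous diagnostic.</p>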
<p>Diagnostic messages are divided into three levels:</p>
<blockquote>
<table>
  <tr><th>fatal error</th><td>Indicates an error so serious that it is no longer meaningful to continue preprocessing.</td></tr>
  <tr><th>error      </th><td>Indicates there is a syntax or usage error.</td></tr>
  <tr><th>warning    </th><td>Indicates code lacks portability or may contain a bug.</td></tr>
</table>
</blockquote>
<p>Warnings are further divided into five classes:</p>
<blockquote>
<table>
  <tr><th>Class 1 </th><td>Source code may contain a bug or at least lack portability.</td></tr>
  <tr><th>Class 2 </th><td>Probably, source code will present no problem in practical use, but is problematic in terms of Standard conformance.</td></tr>
  <tr><th>Class 4 </th><td>Probably, source code will present no problem in practical use, but is problematic in terms of portability.</td></tr>
  <tr><th>Class 8 </th><td>Rather superfluous warnings about skipped #if groups, sub-expressions in a #if expression whose evaluation is skipped, and so on.</td></tr>
  <tr><th>Class 16</th><td>Warnings about trigraphs and digraphs.</td></tr>
</table>
</blockquote>
<p>Warnings other than Class 1 or 2 are rather specific to <b>mcpp</b>.</p>
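<p>The class numbers are distinct powers of two, so a set of classes can be written as one number by OR-ing them.  The following minimal sketch illustrates that idea only; the names are hypothetical, and it assumes that a warning level is given as such an OR of classes (for example, 3 for classes 1 and 2).</p>

```c
/* Warning classes as listed in the table above. */
enum warn_class {
    WARN_CLASS_1  = 1,
    WARN_CLASS_2  = 2,
    WARN_CLASS_4  = 4,
    WARN_CLASS_8  = 8,
    WARN_CLASS_16 = 16
};

/* Nonzero if the class is selected by a level built by OR-ing classes. */
int warn_class_selected(int level, enum warn_class cls)
{
    return (level & (int)cls) != 0;
}
```

<p>With this encoding, a level of 1 | 8 | 16 (that is, 25) selects classes 1, 8 and 16 and nothing else.</p>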
<p><b>mcpp</b> has various types of diagnostic messages.  For example, <i>STD</i> mode provides the following types of diagnostics for each level and class.</p>
<blockquote>
<table>
  <tr><th>fatal error     </th><td>17 types</td></tr>
  <tr><th>error           </th><td>76 types</td></tr>
  <tr><th>warning class 1 </th><td>49 types</td></tr>
  <tr><th>warning class 2 </th><td>15 types</td></tr>
  <tr><th>warning class 4 </th><td>17 types</td></tr>
  <tr><th>warning class 8 </th><td>30 types</td></tr>
  <tr><th>warning class 16</th><td> 2 types</td></tr>
</table>
</blockquote>
<p>Principally, these messages point out the code in question.  The diagnostic messages below have a sample token or numeric value from source code embedded.  For the messages with a macro name embedded, the value the macro expands to is shown in real messages.</p>
<p>In some cases, the same message is issued either as a warning or as an error.  This manual describes such a message in detail at its first occurrence and only lists it at subsequent occurrences.</p>
<br>

<h2><a name="5.2" href="#toc.5.2">5.2. Translation Limits</a></h2>
<p>Of all the errors shown below, some, such as a buffer overflow, are caused by restrictions in the <b>mcpp</b> specification.  Some macros in system.H define translation limits, such as buffer sizes.  If necessary, enlarge the buffer size and recompile <b>mcpp</b>, but be careful not to increase it too much: a large buffer on a system with a small amount of memory may cause frequent "out of memory" errors.</p>
<br>

<h2><a name="5.3" href="#toc.5.3">5.3. Fatal Errors</a></h2>
<p>A fatal error occurs and preprocessing is terminated when it is no longer possible to continue preprocessing due to an I/O error or a shortage of memory, or it is no longer meaningful to do so due to a buffer overflow.  A status value of failure is returned to a parent process.</p>

<h3><a name="5.3.1" href="#toc.5.3.1">5.3.1. <b>mcpp</b>'s Own Bugs</a></h3>
<ul>
<li><samp>Bug:</samp><br>
This message has several types.  Should it be issued, it would indicate <b>mcpp</b>'s own bug.  I think this message is rarely issued, but should it be issued, do not hesitate to let me know the situation.<br>
</ul>

<h3><a name="5.3.2" href="#toc.5.3.2">5.3.2. Physical Errors</a></h3>
<ul>
<li><samp>File read error</samp><br>
An error has occurred while reading a source file.  The disk or file system may have been damaged.<br>
<br>
<li><samp>File write error</samp><br>
An error has occurred while writing to a file.  The disk or file system may have been damaged or may be full.<br>
<br>
<li><samp>Out of memory (required size is 0x1234 bytes)</samp><br>
<b>mcpp</b> has run short of memory: it tried to obtain 0x1234 bytes from the heap, but in vain.  This error occurs when there are too many long macro definitions on a system with a small amount of memory.  Divide your source file to decrease the number of macro definitions in one translation unit.<br>
</ul>

<h3><a name="5.3.3" href="#toc.5.3.3">5.3.3. Translation Limits and Internal Buffer Errors</a></h3>
<ul>
<li><samp>Too long header name "long-file-name"</samp><br>
The length of the full path name of a file to include (file name concatenated with the specified directory path) has exceeded <i>PATHMAX</i>.<br>
<br>
<li><samp>Too long source line</samp><br>
The length of a physical line in source file has exceeded <i>NBUFF</i>-2.  The source code may not be written in C.<br>
<br>
<li><samp>Too long logical line</samp><br>
The length of a logical line, which is constructed from several physical lines with \ at the line end, has exceeded <i>NBUFF</i>-2.  This error may occur when a macro definition is too long.  Such code should be written not as a macro but as a function.<br>
<br>
<li><samp>Too long line spliced by comments</samp><br>
The length of a preprocessed line with a comment replaced with a space character has exceeded <i>NBUFF</i>-2.  This error occurs when several lines are concatenated into one if a comment spreads over several lines.  Divide the comment into several parts and write each on a separate line.<br>
<br>
<li><samp>Too long token</samp><br>
A preprocessed line has a token longer than <i>NWORK</i>-2.  <b>mcpp</b> tries to divide the preprocessed line into pieces of a length, bounded by <i>NWORK</i>, that the compiler-proper can accept.  However, if a line contains an extremely long token, it sometimes fails to do so.<br>
</ul>
<p>The following four errors may also be caused by a buffer overflow during macro expansion, even at a token that is not particularly long.  In that case, you should divide the macro invocation.</p>
<ul>
<li><samp>Too long quotation "long-string"</samp><br>
A string literal, character constant or header-name is too long.  In the case of a string literal, divide it.  A Standard-conforming compiler concatenates adjacent string literals for you.<br>
<br>
<li><samp>Too long pp-number token "1234567890toolong"</samp><br>
A preprocessing-number token is too long.  This error is issued in Standard mode.<br>
<br>
<li><samp>Too long number token "12345678901234......"</samp><br>
A number token is too long.  This error is issued in pre-Standard mode.<br>
<br>
<li><samp>Buffer overflow scanning token "token"</samp><br>
A buffer overflow has occurred during token scan.  This message is issued to tokens other than string literals, character constants, header-names and pp-numbers.<br>
<br>
<li><samp>More than <i>BLK_NEST</i> nesting of #if (#ifdef) sections</samp><br>
The depth of nested #if, #ifdef, and #ifndef has exceeded <i>BLK_NEST</i>. (In the real message, the macro name <i>BLK_NEST</i> is replaced with the actual numerical value.  This applies to all the messages below with a macro name embedded.)  Divide the #if section.<br>
<br>
<li><samp>More than <i>INCLUDE_NEST</i> nesting of #include</samp><br>
The depth of nested #include has exceeded <i>INCLUDE_NEST</i>.  Probably the #includes are in infinite recursion.<br>
<br>
</ul>
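<p>As a remedy for "Too long quotation", a long string literal can be divided into adjacent shorter ones, which the compiler joins in a later translation phase.  The following minimal sketch shows the idiom; the text and the names are made up for illustration.</p>

```c
#include <string.h>

/* Three short adjacent literals; the compiler joins them into one string. */
static const char usage_text[] =
    "usage: prog [options] file\n"
    "  -h  show this help\n"
    "  -v  show the version\n";

/* Compares the joined text with the same text written as one literal. */
int usage_text_is_joined(void)
{
    return strcmp(usage_text,
        "usage: prog [options] file\n  -h  show this help\n  -v  show the version\n") == 0;
}
```

<p>Each literal stays within one logical line, so the preprocessor never sees a single over-long token.</p>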

<h3><a name="5.3.4" href="#toc.5.3.4">5.3.4. #pragma MCPP preprocessed Related Errors</a></h3>
<ul>
<li><samp>This is not the preprocessed source</samp><br>
Although the "#pragma MCPP preprocessed" directive is found, this is not a source preprocessed by <b>mcpp</b>.<br>
<br>
<li><samp>This preprocessed file is corrupted</samp><br>
This seems to be a source preprocessed by <b>mcpp</b>, but it cannot be used because it is corrupted.<br>
</ul>
<br>

<h2><a name="5.4" href="#toc.5.4">5.4. Errors</a></h2>
<p><b>mcpp</b> issues an error message when it finds a grammatical error.  Standard C stipulates that a compiler system should issue a diagnostic message when it encounters a violation of syntax rules or constraints.  Principally, Standard mode issues an error message for this type of violation, but sometimes issues a warning.</p>
<p><b>mcpp</b> issues an error message or warning for most of the items that are undefined in Standard C.  However, <b>mcpp</b> issues neither an error nor a warning for the following undefined items:</p>
<ol>
<li>' or /* in a header name in the form of a string literal: <b>mcpp</b> regards them as characters, resulting in a file open error. (' or /* in a header name enclosed with &lt; and &gt; is regarded as the beginning of a character constant or a comment, resulting in some errors.)  Although how to treat \ in a header name is undefined in Standard C, <b>mcpp</b> does not check it because it may eventually cause an error when <b>mcpp</b> actually tries to open the file.  <b>mcpp</b> on Windows issues a class 2 warning to \ and converts it to /.<br>
<br>
<li>#undef defined: Although #undef-ing a name "defined" yields an undefined result, <b>mcpp</b> does not issue a message because, in the first place, <b>mcpp</b> does not allow definition of a macro named "defined", so the macro to be undefined never exists.<br>
<br>
<li>Illegal multi-byte character sequence in a comment: Although how to deal with such character sequence is undefined in Standard C, <b>mcpp</b> does not issue a message because it does no harm. (<b>mcpp</b> issues a warning to an illegal multi-byte character sequence in string literals, character constants and header names.)<br>
<br>
<li>Identifiers that begin with _ (Reserved for compiler systems): Although using these identifiers in a user program will cause an undefined result, <b>mcpp</b> does not check it because <b>mcpp</b> does not always have a means to decide whether these identifiers are used in a user program or the compiler-system.<br>
<br>
<li><tt>__STDC_ISO_10646__</tt>, <tt>__STDC_IEC_559__</tt>, and <tt>__STDC_IEC_559_COMPLEX__</tt>: Although #defining or #undef-ing these optional C99 predefined macros yields an undefined result, <b>mcpp</b> does not check it because <b>mcpp</b> does not always have a means to determine whether these macros appear in a user program or the compiler-system. (These macros are most likely to be defined in a header file of a compiler system.)<br>
<br>
<li>UCN equivalent sequence: Although it is undefined in C99 how to deal with a UCN equivalent sequence generated by deleting &lt;backslash&gt;&lt;newline&gt; during the translation phase 2 or by concatenating string literals, <b>mcpp</b> does not issue a message and regards it as a UCN.<br>
</ol>
<p>For details on what is a violation of syntax rule or constraint, undefined, unspecified or implementation-defined in Standard C preprocessing, refer to cpp-test.html.</p>
<p>Even if an error occurs, <b>mcpp</b> continues preprocessing as long as the error is not a fatal one.  <b>mcpp</b> shows the number of errors and returns the status of failure to the parent process when it exits.</p>

<h3><a name="5.4.1" href="#toc.5.4.1">5.4.1. Character and Token Related Errors</a></h3>
<ul>
<li><samp>Illegal control character 0x1b, skipped the character</samp><br>
A control code other than a white space character is found in a string literal, character constant, header name or comment.  <b>mcpp</b> skips it and continues preprocessing.<br>
</ul>
<p>The following several messages are all token-related errors.  For the first four, <b>mcpp</b> skips the line in question and continues preprocessing.  The first three are string literal or other token-related errors, indicating that a closing quotation mark is not found by the end of the logical line.  This type of error occurs when you write text that does not form a preprocessing-token sequence outside a string literal or comment, as shown below:</p>
<pre>
#error I can't understand.
</pre>
<p>As preprocessing-tokens are not so strictly defined as C tokens in the compiler-proper, most character sequences are regarded as pp-token sequences, as long as they belong to the source character set.  Therefore, it is only this type of coding that causes a preprocessing-token error.  Pp-token errors may occur even in a skipped #if group.</p>
<ul>
<li><samp>Unterminated string literal "string</samp><br>
A string literal is unterminated.  A string literal cannot spread over several logical lines.  If necessary, write a string literal on each of several lines and have the compiler concatenate them.  This error may occur during conversion into a string by the # operator, in which case the line in question is not skipped.  <b>mcpp</b> in <i>OLDPREP</i> mode does not make an unterminated string literal an error. (Instead, it regards the line end as the literal end.)  Nor does <b>mcpp</b> when invoked with the -a (-lang-asm, -x assembler-with-cpp) option (it issues a warning); it regards an unterminated string literal as a literal spreading over several lines and concatenates a line with the next by inserting \n.<br>
<br>
<li><samp>Unterminated character constant 't understand.</samp><br>
A character constant is not terminated.  <b>mcpp</b> in <i>OLDPREP</i> mode and lang-asm mode does not make it an error. (Instead, it regards the line end as literal end.)<br>
<br>
<li><samp>Unterminated header name &lt;header.h</samp><br>
A header-name is not terminated.  " or ' in a header-name enclosed with &lt; and &gt; causes the above two errors, not this one.  If /* is found in a header-name enclosed with &lt; and &gt;, <b>mcpp</b> regards it and the following text as a comment.<br>
<br>
<li><samp>Empty character constant ''</samp><br>
A character constant is empty.
In lang-asm mode, <b>mcpp</b> issues only a warning; in other modes it issues an error.<br>
<br>
<li><samp>Illegal UCN sequence</samp><br>
<b>mcpp</b> in <i>STD</i> mode invoked with <tt>__STDC_VERSION__</tt> set to 199901L or in C++ mode can recognize UCNs.  This message is issued when the number of digits of a hex sequence that begins with \u or \U in an identifier is less than four or eight, respectively.  (If this occurs in a character constant in a #if expression, an undefined escape sequence warning results.  Other tokens are not checked by <b>mcpp</b>.)<br>
<br>
<li><samp>UCN cannot specify the value "0000007f"</samp><br>
UCN cannot specify a hex value in the ranges of 0 to 9f, except for 0x24 ($), 0x40 (@) and 0x60 (`), and of d800 to dfff.  The former range agrees with the range of the basic source character set.  The latter range falls into the reserved area for special characters.  Note C++ does not have the latter restriction. (Specifications slightly differ among Standards for an unknown reason.)  However, when <b>mcpp</b> in <i>STD</i> mode is invoked as C++ with -V199901L to preset the <tt>__cplusplus</tt> macro to 199901L or higher, <b>mcpp</b> behaves in accordance with the C99 specifications in this respect.<br>
<br>
<li><samp>Illegal multi-byte character sequence "XY"</samp><br>
<b>mcpp</b> in <i>STD</i> mode compiled with <i>OK_MBIDENT</i> == <i>TRUE</i> allows a multi-byte character in an identifier in C99.  However, it issues an error when it finds a character sequence that cannot be regarded as a multi-byte character although the first byte of the sequence is that of a multi-byte character. (Outside an identifier, such an illegal sequence causes a warning.)<br>
</ul>
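<p>The value restriction described under "UCN cannot specify the value" can be written as a small predicate.  The following is a minimal sketch of that rule only, not <b>mcpp</b>'s implementation; the function name and the c99_rule parameter are hypothetical.  Passing 0 for c99_rule drops the d800 to dfff restriction, as described above for C++.</p>

```c
/*
 * Nonzero if 'value' may be specified by a UCN under the rule
 * described above.  'c99_rule' selects whether the range
 * d800..dfff is rejected as well.
 */
int ucn_value_allowed(unsigned long value, int c99_rule)
{
    if (value < 0xa0UL)                       /* the range 0 to 9f */
        return value == 0x24UL || value == 0x40UL || value == 0x60UL;
    if (c99_rule && value >= 0xd800UL && value <= 0xdfffUL)
        return 0;
    return 1;
}
```

<p>For example, 0x7f from the sample message is rejected, while 0x24 ($) and 0xa0 are accepted.</p>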

<h3><a name="5.4.2" href="#toc.5.4.2">5.4.2. Unterminated Source File Related Errors</a></h3>
<p>This section covers messages issued when a source file ends with an unterminated #if section or macro invocation.  If the file (not included file) marks the end of input, the message "End of input", not "End of file", is issued.</p>
<p>These diagnostic messages are issued as an error or warning, depending on <b>mcpp</b> modes.</p>
<p>Standard mode issues these messages as errors, in which case <b>mcpp</b> skips the macro invocation in question and restores the relationship between paired directives in a #if section to what it was when the file was initially included.</p>
<p>On the other hand, pre-Standard mode issues warnings.  <i>OLDPREP</i> mode does not even issue a warning, except for an unterminated macro call.</p>
<ul>
<li><samp>End of file within #if (#ifdef) section started at line 123</samp><br>
#if (#ifdef or #ifndef) on the line 123 does not have a corresponding #endif.<br>
<br>
<li><samp>End of file within macro invocation started at line 123</samp><br>
A macro invocation that begins at line 123 is not terminated by the end of the file.  This error may occur when an argument has an ill-balanced parenthesis, or when a token error occurs between opening and closing parentheses, in which case <b>mcpp</b> continues to read tokens looking for the corresponding parenthesis until it reaches the end of the file. (Possibly, a buffer overflow may occur before reaching there.)  In addition, since macro expansion specifications vary among modes, a macro that is successfully expanded in one mode may not be in other modes.<br>
</ul>

<h3><a name="5.4.3" href="#toc.5.4.3">5.4.3. Ill-Balanced Preprocessing Group Related Errors</a></h3>
<p>This section covers errors caused by ill-balanced directives such as #if and #else.  Even if <b>mcpp</b> finds an imbalance among these directives, it continues processing, assuming that the preprocessing group so far still continues.  <b>mcpp</b> checks whether directives are balanced even in a skipped #if group.</p>
<p>The #if (#ifdef) section is a block between #if (#ifdef or #ifndef) and #endif.  The #if (#elif, #else) group is a smaller block, say, between #if (#ifdef or #ifndef) and #elif, between #elif and #else, or between #else and #endif within the #if (#ifdef) section.</p>
<ul>
<li><samp>Already seen #else at line 123</samp><br>
Another #else (#elif) is found after #else at the line 123.  #endif may be missing.<br>
<br>
<li><samp>Not in a #if (#ifdef) section</samp><br>
#else (#elif, #endif) is found without #if (#ifdef or #ifndef).<br>
<br>
<li><samp>Not in a #if (#ifdef) section in a source file</samp><br>
An included file has #else (#elif or #endif) without #if (#ifdef or #ifndef).  If the contents of the included file had been written in the including source file, this error would not have occurred.  In other words, these directives are not balanced within each separate file.  Only Standard mode issues this error. (pre-Standard mode issues a warning.)<br>
</ul>
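<p>For reference, a balanced #if section looks as follows.  This is a minimal sketch with a hypothetical configuration macro FEATURE_LEVEL; the comments mark the groups in the terminology used above.</p>

```c
#define FEATURE_LEVEL 2         /* hypothetical configuration macro */

/* Returns a number identifying which group of the section was compiled. */
int selected_group(void)
{
#if FEATURE_LEVEL >= 3          /* #if group */
    return 3;
#elif FEATURE_LEVEL == 2        /* #elif group */
    return 2;
#else                           /* #else group */
    return 1;
#endif                          /* closes the #if section */
}
```

<p>A second #else, an #elif after the #else, or a missing #endif in this section would cause one of the errors above.  The whole section also has to sit in one file to avoid the last one.</p>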
<p>The following two errors occur when #asm and #endasm are not balanced.  These messages are issued only by compiler-specific-build for a particular compiler system and in pre-Standard mode.</p>
<ul>
<li><samp>In #asm block started at line 123</samp><br>
A #asm block that begins at the line 123 has another #asm.  #asm cannot be nested.  Maybe, the programmer forgot to write #endasm.<br>
<br>
<li><samp>Without #asm</samp><br>
#endasm is found in a non #asm block.<br>
</ul>

<h3><a name="5.4.4" href="#toc.5.4.4">5.4.4. Simple Syntax Errors on Directive Lines</a></h3>
<p>This section covers simple syntax errors on directive lines that begin with #.  The errors hereinafter discussed until 5.4.12 do not occur within a skipped #if group. (<b>mcpp</b> invoked with the -W8 option issues a warning to an unknown directive.)</p>
<p>When <b>mcpp</b> finds a directive line with a syntax error, it ignores the line and continues processing, in which case, it neither regards #if as the beginning of a section nor changes line numbers even with a #line.  If a #include or #line line has a macro argument, Standard mode expands the macro and checks the syntax.  Pre-Standard mode does not expand the macro.</p>
<p>Although the messages below do not show the directive name in question, the source line that follows the message shows it. (A directive line with a comment converted to a space character always becomes one line, which is called a "preprocessed line" here.)</p>
<ul>
<li><samp>Illegal #directive "123"</samp><br>
A token that immediately follows # is not a name.  The token must be a directive name. (<i>OLDPREP</i> mode regards #123 as #line 123.)<br>
<br>
<li><samp>Unknown #directive "pseudo-directive"</samp><br>
The directive "pseudo-directive" is not implemented.  <b>mcpp</b> invoked with the -a (-lang-asm or -x assembler-with-cpp) option issues a warning, not an error.<br>
<br>
<li><samp>No argument</samp><br>
#if, #elif, #ifdef, #ifndef, #assert or #line has no arguments.<br>
<br>
<li><samp>No header name</samp><br>
A #include line does not have an argument, or expansion of a macro argument of a #include line results in no token.<br>
<br>
<li><samp>Not a header name "UNDEFINED_MACRO"</samp><br>
The specified argument is not a header name.  This message is issued when a macro that should define a header name is not defined.  A header name must be enclosed with &lt; and &gt;, or with " and ".<br>
<br>
<li><samp>Not an identifier "123"</samp><br>
#ifdef, #ifndef, #define or #undef requires an identifier as an argument, but 123 is not an identifier.<br>
<br>
<li><samp>No identifier</samp><br>
#define or #undef does not have an argument.<br>
<br>
<li><samp>No line number</samp><br>
#line has a macro argument, but its expansion has resulted in no token.<br>
<br>
<li><samp>Not a line number "name"</samp><br>
The first argument of a #line is not a numeric token (preprocessing number).<br>
<br>
<li><samp>Line number "0x123" isn't a decimal digits sequence</samp><br>
The first argument of a #line must be a decimal integer.  Standard mode issues this message.  In pre-Standard mode, hex and octal integer tokens are allowed although a warning is issued.<br>
<br>
<li><samp>Line number "2147483648" is out of range of [1,2147483647]</samp><br>
The first argument of a #line must be within the range of 1 to 2147483647.  0 is regarded as an error.  This is applied to Standard mode.<br>
<br>
<li><samp>Not a file name "name"</samp><br>
The second argument of a #line, if any, must be a string literal.  An identifier or wide string literal is not allowed here.<br>
</ul>
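<p>For comparison with the #line errors above, a valid #line directive takes a decimal line number within the range and, optionally, an ordinary string literal as the file name.  The following minimal sketch uses made-up function names and a made-up file name.</p>

```c
#include <string.h>

/* The line that follows "#line 100" is numbered 100. */
int line_after_directive(void)
{
#line 100 "renamed.c"
    return __LINE__;
}

/* From the #line directive onward, __FILE__ is the name given there. */
int file_is_renamed(void)
{
    return strcmp(__FILE__, "renamed.c") == 0;
}
```

<p>Writing the number as 0x64, or the file name as an identifier or a wide string literal, would cause one of the errors listed above in Standard mode.</p>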
<p>The following error occurs only in Standard mode and this directive is ignored.  <i>OLDPREP</i> mode issues neither an error nor a warning.  <i>KR</i> mode issues a warning and continues preprocessing as if there had been no "junk" text.</p>
<ul>
<li><samp>Excessive token sequence "junk"</samp><br>
A #else, #endif, #asm, or #endasm line has junk text, or such text follows a valid argument of a #ifdef, #ifndef, #include, #line or #undef line.<br>
</ul>

<h3><a name="5.4.5" href="#toc.5.4.5">5.4.5. Syntax Errors in #if Expressions</a></h3>
<p>This section covers syntax-related errors in #if, #elif and #assert directives.  If a #if (#elif) line has these errors, <b>mcpp</b> evaluates it to false, skips the #if (#elif) group, and continues processing.</p>
<p>For a skipped #if (#ifdef, #ifndef, #elif or #else) group, <b>mcpp</b> checks validity of C preprocessing tokens and balance of these directives, but not other grammatical errors.</p>
<p>A #if line may have a sub-expression whose evaluation is skipped.  For example, in the case of #if a || b, if "a" is evaluated to true, "b" is not evaluated at all.  However, the following 14 types of syntax errors or translation limit errors are checked even if they are located in a sub-expression whose evaluation is skipped.</p>
<ul>
<li><samp>More than <i>NEXP</i>*2-1 constants stacked at "12"</samp><br>
The number of constants in the stack has exceeded <i>NEXP</i>*2-1 when <b>mcpp</b> tried to evaluate "12" in a #if expression.  The depth of nested #if expressions is too deep.<br>
<br>
<li><samp>More than <i>NEXP</i>*3-1 operators and parens stacked at "+"</samp><br>
The total number of operators and parentheses in the stack has exceeded <i>NEXP</i>*3-1 when <b>mcpp</b> tried to evaluate '+' in a #if expression. (A pair of parentheses is counted as two.)  The depth of nested #if expressions is too deep.<br>
<br>
<li><samp>Misplaced constant "12"</samp><br>
A #if expression has a constant '12' where no constant should be found.  This error occurs when casting, such as (int)0x8000, is used in a #if expression, where casting is not allowed.  In this case, (int)0x8000 is evaluated to (0)0x8000, causing this error.  The int is regarded as an identifier that is not defined as macro and is evaluated to 0.<br>
<br>
<li><samp>Operator "&gt;" in incorrect context</samp><br>
A #if expression has a &gt; operator where no &gt; should be found.  If a macro MACRO is defined as 0 token, #if MACRO &gt; 0 will be expanded to #if &gt; 0, causing this error, which is indicated by the preceding warning -- Macro "MACRO" is expanded to 0 token.<br>
<br>
<li><samp>Unterminated expression</samp><br>
A #if expression is not terminated.  This error is caused by, for example, #if a || MACRO with MACRO defined as 0 token.<br>
<br>
<li><samp>Excessive ")"</samp><br>
A #if expression has a ")" that does not correspond to a "(".<br>
<br>
<li><samp>Missing ")"</samp><br>
A #if expression does not have a ")" that corresponds to "(".<br>
<br>
<li><samp>Misplaced ":", previous operator is "+"</samp><br>
A #if expression has a : without a corresponding ?.<br>
<br>
<li><samp>Bad defined syntax</samp><br>
A #if defined has a syntax error.  This error is caused by an unbalanced parenthesis or missing identifier in an argument.  When a macro expansion causes this error, <b>mcpp</b> displays this message followed by an expansion result.<br>
<br>
<li><samp>Can't use a string literal "string"</samp><br>
A string literal is not allowed as a constant in a #if expression.<br>
<br>
<li><samp>Can't use a character constant 'a'</samp><br>
In <i>POSTSTD</i> mode, a character constant, or a wide character constant is not allowed as a constant in a #if expression.<br>
<br>
<li><samp>Can't use the operator "++"</samp><br>
A #if expression has an illegal operator, such as = or ++.<br>
<br>
<li><samp>Not an integer "1.23"</samp><br>
Only integers, including character constants, are allowed as a constant in a #if expression.<br>
<br>
<li><samp>Can't use the character 0x24</samp><br>
A #if expression contains an illegal character (code 0x24), which is not any of the preprocessing tokens: identifiers, operators, punctuators, string literals, character constants, and preprocessing numbers. (Control codes are excluded since they had been checked before.)
Even compiler-specific-builds for compiler systems that allow $ in an identifier disallow it depending on options.
Of course, this is not checked in a skipped group.<br>
</ul>
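<p>Some of the situations above can be avoided in the source.  The following minimal sketch uses hypothetical macro and function names: it shows a common idiom for a macro that may be defined as no token, the evaluation of a non-macro identifier to 0, and a sub-expression whose evaluation is skipped.</p>

```c
#define EMPTY_MACRO             /* expands to no token */

/* "#if EMPTY_MACRO > 0" would leave "#if > 0".  Adding "+ 0" inside
 * parentheses keeps the expression valid: ( + 0) > 0 is false. */
int empty_macro_is_positive(void)
{
#if (EMPTY_MACRO + 0) > 0
    return 1;
#else
    return 0;
#endif
}

/* An identifier that is not defined as a macro is evaluated to 0. */
int undefined_name_is_zero(void)
{
#if NOT_DEFINED_ANYWHERE == 0
    return 1;
#else
    return 0;
#endif
}

/* With ||, the right operand is not evaluated once the left is true. */
int right_operand_is_skipped(void)
{
#if 1 || (1 / 0)
    return 1;
#else
    return 0;
#endif
}
```

<p>The division by zero in the last function is syntactically valid, so it passes the checks listed above; it would be an evaluation error only if the sub-expression were actually evaluated.</p>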
<p>The following error messages are relevant to #if sizeof.  Only pre-Standard mode issues these errors.</p>
<ul>
<li><samp>sizeof: Syntax error</samp><br>
A #if sizeof has a syntax error.  This error is caused by an unbalanced parenthesis or missing arguments.<br>
<br>
<li><samp>sizeof: No type specified</samp><br>
Like sizeof(*), the "type" of #if sizeof (type) is not specified.  Note that sizeof ((*)()) is a valid syntax to determine the size of a pointer to a function.<br>
</ul>

<h3><a name="5.4.6" href="#toc.5.4.6">5.4.6. #if Expression Evaluation Errors</a></h3>
<p>The following errors do not occur in a sub-expression whose evaluation is skipped. (<b>mcpp</b> invoked with the -W8 option issues a warning.)</p>
<p>The Standards say that a #if expression is evaluated in the largest integer type in C99, and in long / unsigned long in C90 and C++98.  <b>mcpp</b> evaluates it in long long / unsigned long long even in C90 or C++98, and issues a warning on a value outside the range of long / unsigned long in C90 and C++98.  In this subsection, please read long long / unsigned long long as long / unsigned long for a compiler without long long, and as long in pre-Standard mode.  In <i>POSTSTD</i> mode, a character constant is not available in a #if expression and causes a different error.</p>
<ul>
<li><samp>Constant "123456789012345678901" is out of range</samp><br>
An integer constant has a value that exceeded the range of unsigned long long.<br>
<br>
<li><samp>Integer character constant 'abcdefghi' is out of range</samp><br>
A character constant 'abcdefghi' has a value that exceeded the range of unsigned long long.<br>
<br>
<li><samp>Wide character constant L'abcde' is out of range</samp><br>
A wide character constant L'abcde' has a value that exceeded the range of unsigned long long.  This error occurs only in <i>STD</i> mode.<br>
<br>
<li><samp>8 bits can't represent escape sequence 'x123'</samp><br>
An escape sequence in a character constant has exceeded the range of 8 bits ([0, 0xFF]).<br>
<br>
<li><samp>16 bits can't represent escape sequence L'x12345'</samp><br>
An escape sequence in a wide character constant has exceeded the range of 16 bits (32 bits for UTF-8).  This error occurs only in <i>STD</i> mode.<br>
<br>
<li><samp>Division by zero</samp><br>
A #if expression contains a division by zero.  A division can be expressed using / or %.  This error may be caused by a #if dividend/divisor with the divisor not defined as a macro.  To avoid this error, "#if defined divisor &amp;&amp; (dividend/divisor ..)" is recommended.<br>
<br>
<li><samp>Result of "op" is out of range</samp><br>
An operation result using the operator op is out of range of (unsigned) long long.  Op is any of binary operators: *, /, %, +, and -.  When two's complement representation is used, the unary operator '-' will cause an overflow with -<tt>LLONG_MIN</tt>.  Unsigned long long will never cause an overflow, so it does not cause this error.  If the result of an algebraic calculation is out of range, a warning is issued.<br>
</ul>
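<p>The guard recommended for the division by zero error can be sketched as follows.  <samp>DIVISOR</samp> is a hypothetical name, assumed here not to be defined as a macro, so it is evaluated as 0 in a #if expression.</p>
<pre>
#if     100 / DIVISOR &gt; 1                       /* Division by zero             */
#endif
#if     defined DIVISOR &amp;&amp; (100 / DIVISOR &gt; 1)  /* division is skipped: no error */
#endif
</pre>
<p>The guard covers only the undefined case.  A <samp>DIVISOR</samp> that is defined as 0 would still divide by zero, so adding a test such as DIVISOR != 0 before the division may also be worthwhile.</p>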
<p>The following errors are relevant to sizeof and occur only in pre-Standard mode.  They are not issued in a sub-expression whose evaluation is skipped. (With the -W8 option, a warning is issued.)</p>
<ul>
<li><samp>sizeof: Unknown type "type"</samp><br>
The "type" of #if sizeof (type) is unknown.<br>
<br>
<li><samp>sizeof: Illegal type combination with "type"</samp><br>
A type combination, like #if sizeof (long float), is invalid.<br>
</ul>

<h3><a name="5.4.7" href="#toc.5.4.7">5.4.7. #define Related Errors</a></h3>
<p>This section covers #define related errors.  A macro will not be defined if an error occurs at #define.  The # and ## operator related errors occur in Standard mode.  <samp>__VA_ARGS__</samp> related errors also occur in Standard mode.  Although the variable argument macro is a C99 specification, <b>mcpp</b> allows these macros to be used in C90 and C++ modes for compatibility with GCC and Visual C++ 2005, 2008. (A warning is issued.)</p>
<ul>
<li><samp>"defined" shouldn't be defined</samp><br>
A macro named "defined" cannot be defined.  Standard mode checks this.<br>
<br>
<li><samp>"__STDC__" shouldn't be redefined</samp><br>
The <tt>__STDC__</tt> macro cannot be #defined.  The same can be said with <tt>__STDC_VERSION__</tt>, <tt>__FILE__</tt>, <tt>__LINE__</tt>, <tt>__DATE__</tt> and <tt>__TIME__</tt> (<tt>__STDC_HOSTED__</tt> in C99 mode, and <tt>__cplusplus</tt> when <b>mcpp</b> is invoked with -+ option).  Standard mode checks these macros.<br>
<br>
<li><samp>"__VA_ARGS__" shouldn't be defined</samp><br>
C99 allows a variable argument macro with the <samp>__VA_ARGS__</samp> parameter in the replacement list, but this identifier cannot be defined as a macro.<br>
<br>
<li><samp>More than <i>NMACPARS</i> parameters</samp><br>
The number of parameters of a macro definition has exceeded <i>NMACPARS</i>.<br>
<br>
<li><samp>Empty parameter</samp><br>
A macro definition has an empty parameter.<br>
<br>
<li><samp>Illegal parameter "123"</samp><br>
A token other than an identifier is used in a parameter of a macro definition.  In Standard mode, even an identifier <samp>__VA_ARGS__</samp> cannot be used.<br>
<br>
<li><samp>Duplicate parameter name "a"</samp><br>
A macro definition has a duplicate parameter name "a".<br>
<br>
<li><samp>Missing "," or ")" in parameter list "(a,b"</samp><br>
A macro definition does not have a parenthesis ")" that closes a parameter list.  Or, a parameter is followed by neither ',' nor ')'.<br>
<br>
<li><samp>No token before ##</samp><br>
No token precedes the ## operator in the replacement list of a macro definition.<br>
<br>
<li><samp>No token after ##</samp><br>
No token follows the ## operator in the replacement list of a macro definition.<br>
<br>
<li><samp>## after ##</samp><br>
The replacement list of a macro definition has a token sequence of "## ##".  Some may not regard this as an error, but since concatenation of ## with another token always generates an invalid token, it always causes an error when it occurs in macro expansion.  <b>mcpp</b> therefore treats it as an error when it finds this sequence in a macro definition.<br>
<br>
<li><samp>Not a formal parameter "id"</samp><br>
A function-like macro definition has a # operator whose operand id is not a parameter name.<br>
<br>
<li><samp>"..." isn't the last parameter</samp><br>
"<samp>...</samp>" must be the last parameter of a macro definition.  In pre-Standard mode, "<samp>...</samp>" causes an illegal parameter error.<br>
<br>
<li><samp>"__VA_ARGS__" without corresponding "..."</samp><br>
"<samp>__VA_ARGS__</samp>", an identifier in a replacement list, can be used only when it has a corresponding "<samp>...</samp>" parameter.<br>
</ul>
<p>In <i>STD</i> mode of GCC-specific-build, if you write GCC2-spec variadic using <samp>__VA_ARGS__</samp>, you will get this error.
<samp>__VA_ARGS__</samp> should be used only in GCC3-spec or C99-spec variadic.</p>
<ul>
<li><samp>__VA_ARGS__ cannot be used in GCC2-spec variadic macro</samp><br>
</ul>
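<p>As a sketch, the following definitions should each draw one of the #define errors described in this section when processed in Standard mode.  All macro names are hypothetical and chosen only for illustration.</p>
<pre>
#define defined         1               /* "defined" shouldn't be defined           */
#define dup(a, a)       (a)             /* Duplicate parameter name "a"             */
#define num(123)        123             /* Illegal parameter "123"                  */
#define glue(a, b)      ## a b          /* No token before ##                       */
#define str(a)          # b             /* Not a formal parameter "b"               */
#define va(..., a)      a               /* "..." isn't the last parameter           */
#define nova(a)         __VA_ARGS__     /* "__VA_ARGS__" without corresponding "..." */
</pre>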

<h3><a name="5.4.8" href="#toc.5.4.8">5.4.8. #undef Related Errors</a></h3>
<p>This section covers #undef related errors.</p>
<ul>
<li><samp>"__STDC__" shouldn't be undefined</samp><br>
The <tt>__STDC__</tt> macro cannot be #undefined. The same can be said with <tt>__STDC_VERSION__</tt>, <tt>__FILE__</tt>, <tt>__LINE__</tt>, <tt>__DATE__</tt> and <tt>__TIME__</tt> (<tt>__STDC_HOSTED__</tt> in C99 mode, and <tt>__cplusplus</tt> when invoked with -+ option).  Standard mode checks these macros.<br>
</ul>

<h3><a name="5.4.9" href="#toc.5.4.9">5.4.9. Macro Expansion Errors</a></h3>
<p>This section covers macro expansion errors.  <b>mcpp</b> displays a macro definition, as well as the source filename and line number where it is found.  The errors related to # or ## operator can occur only in Standard mode.</p>
<ul>
<li><samp>Less than necessary N argument (s) in macro call "macro( a)"</samp><br>
A macro invocation has an insufficient number of arguments.  This macro requires N arguments.  <b>mcpp</b> assigns zero tokens to the missing arguments and continues processing.  <b>mcpp</b> does not regard the invocation of a one-parameter macro with no argument as an error because it cannot distinguish an empty argument from a missing argument.  <i>OLDPREP</i> mode issues a warning instead of an error in this case.<br>
<br>
<li><samp>More than necessary N argument (s) in macro call "macro( a, b, c)"</samp><br>
A macro invocation has too many arguments.  The macro should take N number of arguments.  <b>mcpp</b> ignores surplus arguments and continues processing.  In <i>OLDPREP</i> mode, a warning is issued instead of an error.<br>
<br>
<li><samp>Not a valid preprocessing token "+12"</samp><br>
The ## operator has concatenated two pp-tokens, resulting in an invalid token "+12".  The token may be separated at a later time.  Standard mode continues processing.  <i>STD</i> mode invoked with the -lang-asm (-x assembler-with-cpp, -a) option also issues a warning instead of an error.<br>
<br>
<li><samp>Not a valid string literal "\\"str\""</samp><br>
When a # operator tried to convert a macro invocation's argument into a string, a token sequence of "\\"str"" resulted instead of a single valid string literal.  A \ that precedes or follows the literal causes the error. (When <i>STD</i> mode tries to convert such an argument into a string, it may or may not cause an unterminated string literal error.)  <i>STD</i> mode tries to continue processing, but an error may occur again in the compilation phase.  This error cannot occur in <i>POSTSTD</i> mode. (An unterminated string literal error may occur.)<br>
</ul>
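<p>The argument count and token concatenation errors above can be illustrated by a sketch like the following.  The macros <samp>sub</samp>, <samp>one</samp> and <samp>glue</samp> are hypothetical, and the comments paraphrase the diagnostics rather than quote them exactly.</p>
<pre>
#define sub(x, y)   (x - y)
#define one(x)      (x)
#define glue(a, b)  a ## b

sub(a);         /* Less than necessary: y is given zero tokens      */
sub(a, b, c);   /* More than necessary: the surplus c is ignored    */
one();          /* not an error: taken as one empty argument        */
glue(+, 12);    /* Not a valid preprocessing token "+12"            */
</pre>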
<p>When the following errors occur, the macro invocation will be skipped.</p>
<ul>
<li><samp>Buffer overflow expanding macro "macro" at "something"</samp><br>
A buffer overflow has occurred at "something" during macro expansion.  Divide the macro.<br>
<br>
<li><samp>Unterminated macro call "macro( a, (b, c)"</samp><br>
A macro invocation is not terminated.  This error usually occurs when a macro invocation on the directive line is not terminated at that line.  In Standard mode, a macro in an argument is expanded before argument substitution, in which case the macro invocation must be terminated in the argument.  In <i>POSTSTD</i> mode, a macro invocation unterminated in a replacement list also causes this error.<br>
<br>
<li><samp>Rescanning macro "macro" more than <i>RESCAN_LIMIT</i> times at "something"</samp><br>
The depth of nested macros is so deep that the number of rescans has exceeded <i>RESCAN_LIMIT</i> at "something" during expansion.  This error occurs only in Standard mode but it is quite rare.<br>
<br>
<li><samp>Recursive macro definition of "macro" to "macro"</samp><br>
A macro definition is recursive.  This error occurs only in pre-Standard mode.  When the number of rescans has exceeded <i>PRESTD_RESCAN_LIMIT</i>, <b>mcpp</b> regards it as a recursive macro definition.<br>
</ul>
<p>The following errors can occur only with the -K option in <i>STD</i> mode.
These errors mean that the macro is extremely complex and the buffer for macro notification has run short.
They are almost impossible to encounter in practice.</p>
<ul>
<li><samp>Too many magics nested in macro argument</samp>
<li><samp>Too many nested macros in tracing MACRO</samp>
</ul>

<h3><a name="5.4.10" href="#toc.5.4.10">5.4.10. #error and #assert</a></h3>
<ul>
<li><samp>#error</samp><br>
A #error directive has been executed.  Following this message, the #error line is displayed.  If an argument itself contains a token error, such as an unterminated string, #error is not executed.  Only Standard mode has #error.<br>
<br>
<li><samp>Preprocessing assertion failed:</samp><br>
A #assert directive has been executed.  Following this message, the arguments of the #assert line are displayed.  If any of the arguments contains an error, <b>mcpp</b> regards the assertion as failed.  #assert is available only in pre-Standard mode of builds other than the GCC-specific-build.<br>
</ul>

<h3><a name="5.4.11" href="#toc.5.4.11">5.4.11. Failure of #include</a></h3>
<ul>
<li><samp>Can't open include file "file-name"</samp><br>
This error occurs when a file to include does not exist.  The likely cause is a misspelled file name or an "include directory" that should have been specified.<br>
</ul>

<h3><a name="5.4.12" href="#toc.5.4.12">5.4.12. Other Errors</a></h3>
<p>The following two are checked when <b>mcpp</b> is invoked with the -V199901L option.  The same applies when the -V199901L option is specified in C++ mode.</p>
<ul>
<li><samp>Operand of _Pragma() is not a string literal</samp><br>
The _Pragma() operator must take an argument of one string literal or wide string literal.
<li><samp>_Pragma operator found in directive line</samp><br>
_Pragma() operator is an alternative to #pragma directive.
It cannot be used in a directive line.
</ul>
<br>

<h2><a name="5.5" href="#toc.5.5">5.5. Warnings (Class 1)</a></h2>
<p>A warning is issued when source, although syntactically correct, possibly contains some coding mistakes or has a portability problem.  Warnings are divided into five classes: 1, 2, 4, 8, and 16.  These classes are enabled when the -W &lt;n&gt; option is specified on <b>mcpp</b> invocation.  &lt;n&gt; specifies an ORed value of any of 1, 2, 4, 8, and 16.  Class 4, for example, can be specified explicitly with -W4, and implicitly with -W&lt;n&gt;, where &lt;n&gt; is 1|4, 1|2|4, 2|4, 1|4|8, 4|8, 4|16, etc., because the AND-ed value of &lt;n&gt; and 4 is 4 (non-0).</p>
<p>Standard mode issues an error message for most source code that causes undefined behavior in Standard C, but only a warning for some.</p>
<p>Likewise, Standard mode always issues a warning for source code that relies on behavior unspecified in Standard C, except for the following:</p>
<ol>
<li>Evaluation order of sub-expressions in a #if expression: Although the evaluation order of the operands of operators other than ||, &amp;&amp;, ? , and : is unspecified in Standard C, <b>mcpp</b> does not issue a warning because #if expression does not cause side-effects and therefore the evaluation order does not affect results.  <b>mcpp</b> always evaluates integer constant tokens from left to right in the order they appear and performs an operation in accordance with an operator grouping rule when their values are needed.
</ol>
<p>Standard mode issues a warning to many implementation-defined behaviors, except for the following:</p>
<ol>
<li>Directories the #include directive searches for a file to include and how to construct a header-name pp-token from #include's argument: <b>mcpp</b> does not issue a warning because there would be too many warnings if it did.  Unless a header name is a macro, the source token sequence, including spaces, is used as it is.  If it is a macro, the expanded result, including spaces, is used.  (In <i>POSTSTD</i> mode, a space character is inserted between pp-tokens during macro expansion.  A header-name is constructed by concatenating the resulting pp-tokens from &lt; to &gt; and then by removing space characters.  Anyway, in <i>POSTSTD</i>, a header-name enclosed with &lt; and &gt; is obsolete.)  When <b>mcpp</b> encounters '#pragma MCPP debug path' or '#debug path', it displays the search path instead of issuing a warning.<br>
<br>
<li>Evaluation of a single byte character constant, such as 'a', and of a wide character constant that consists of only one multi-byte character, such as L'X', in a #if expression: <b>mcpp</b> does not issue a warning because, even with the same basic character set, an unlimited number of factors limit portability, such as single byte Katakana, presence or absence of a sign, the encoding scheme of Kanji, and so on.  The same can be said of UCN.<br>
<br>
<li>Bit operations using negative numbers in a #if expression: Although bit operation results depend on the internal representation of an integer on a machine, most machines use two's complement representation, so this causes no portability problem.  However, <b>mcpp</b> issues a warning on a right bit shift of a negative value and on a division in which either or both operands are negative, because they lack portability.<br>
<br>
<li>A sequence of several white space characters as a token separator: Standard C states that it is implementation-defined whether a sequence of white space characters is replaced by one space character in the translation phase 3, but you do not have to worry about this.  Portability becomes an issue only when a preprocessing directive line has &lt;vertical-tab&gt; or &lt;form-feed&gt;.  <b>mcpp</b> converts it into one space character and issues a warning.  For a sequence of several space characters and tabs, <b>mcpp</b> compresses it into one space character without a warning.<br>
<br>
<li>Compiler system's own built-in macros will not cause warning.<br>
<br>
<li>#pragma sub-directive: In principle, <b>mcpp</b> does not issue a warning on a #pragma sub-directive.  However, for #pragma once, #pragma __setlocale, and #pragma MCPP *, which <b>mcpp</b> itself processes, it issues a warning if they have an invalid argument.  In addition, <b>mcpp</b> issues a warning on GCC V.3's #pragmas, such as #pragma GCC poison (dependency), that the resident preprocessor processes but <b>mcpp</b> does not.<br>
<br>
<li>Doubled \: Although it is implementation-defined in C99 whether a single \ is changed into double \ (\\) when the # operator converts a UCN sequence into a string, <b>mcpp</b> does not issue a warning to this.  <b>mcpp</b> does not double \.<br>
</ol>
<p>As you see, <b>mcpp</b> can perform almost all the portability checks necessary at a preprocessing level.</p>
<p><i>POSTSTD</i> mode is identical to <i>STD</i> mode except for some specification differences described in section <a href="#2.1"> 2.1</a>.</p>
<p>Regardless of the number of warnings, <b>mcpp</b> always returns a status of success.  <b>mcpp</b> invoked with the -W0 option does not issue a warning.</p>

<h3><a name="5.5.1" href="#toc.5.5.1">5.5.1. Character, Token and Comment Related Warnings</a></h3>
<ul>
<li><samp>Converted [CR+LF] to [LF]</samp><br>
Converted the newline code from [CR+LF] to [LF].  This warning is issued when the source file for Windows is compiled on UNIX-like systems.  This warning is issued only once, on warning level of class 1 in compiler-independent-build and class 2 in compiler-specific-build.<br>
<br>
<li><samp>Illegal control character 0x1b in quotation</samp><br>
A string literal, character constant or header name has a control code other than a white space character, which may cause an error in the compiler-proper.  This way of coding is not desirable.  A control code in string literals or character constants should be written using an escape sequence.<br>
<br>
<li><samp>Illegal multi-byte character sequence "XY" in quotation</samp><br>
The first byte (X) of "XY" in a string literal, character constant or header name is the first byte of a multi-byte character (Kanji), while the second byte (Y) is not the second byte of a multi-byte character.  "XY" may be displayed garbled.  <b>mcpp</b> does not regard "XY" as a single multi-byte character.  It treats the first byte as a single-byte character and the second byte as the next character.<br>
<br>
<b>mcpp</b> usually does not issue a warning to a character in an external character set, as long as it is within the proper range.  Even within the proper range, there are some holes (no corresponding characters).  <b>mcpp</b> does not check whether such a character is defined or not.  The following table shows the range of each multi-byte character set:<br>
<blockquote>
<table>
  <tr><th>Encoding   </th><td>first byte          </td><td>second byte</td></tr>
  <tr><th>shift-JIS  </th><td>0x81-0x9f, 0xe0-0xfc</td><td>0x40-0x7e, 0x80-0xfc</td></tr>
  <tr><th>EUC-JP     </th><td>0x8e, 0xa1-0xfe     </td><td>0xa1-0xfe</td></tr>
  <tr><th>KS C 5601  </th><td>0xa1-0xfe           </td><td>0xa1-0xfe</td></tr>
  <tr><th>GB 2312-80 </th><td>0xa1-0xfe           </td><td>0xa1-0xfe</td></tr>
  <tr><th>Big Five   </th><td>0xa1-0xfe           </td><td>0x40-0x7e, 0xa1-0xfe</td></tr>
  <tr><th>ISO-2022-JP</th><td>0x21-0x7e           </td><td>0x21-0x7e</td></tr>
</table>
</blockquote>
Besides character codes, ISO-2022-JP has a shift sequence.  Apart from the shift sequence, all the multi-byte characters in encodings other than UTF-8 are two bytes long.<br>
<br>
In UTF-8, multi-byte characters are mostly 2 or 3 bytes long, and possibly 4 bytes long.
Most kanji are encoded in 3 bytes.
The first byte is within the range of 0xc2 to 0xef, second, third and fourth 0x80 to 0xbf.
Details are omitted here.
Anyway, all the bytes must fall within these ranges.
In addition, there are some illegal sequences even in these ranges: <b>mcpp</b> warns at those sequences.<br>
<br>
Note that since <b>mcpp</b> is unable to recognize EUC-JP's three byte encoding (JIS X 0213), it regards 0x8f + 0xa1-0xfe + 0xa1-0xfe not as one character but as two characters of 0x8f and 0xa1-0xfe + 0xa1-0xfe.  As a result, <b>mcpp</b> does not issue a warning to the three byte encoding and can evaluate it correctly, except for a wide character constant in a #if expression.  In EUC-JP, a character with the first byte of 0x8e (a half-width Katakana) is encoded in two bytes, and treated as a multi-byte character.<br>
<br>
This warning is not issued in a skipped #if group.<br>
<br>
<li><samp>"/*" in comment</samp><br>
A comment contains the sequence /*.  Unless this is intended, the programmer may have forgotten to close the preceding comment.  Comments cannot be nested.<br>
<br>
<li><samp>Too long identifier, truncated to "very_long_identifier"</samp><br>
Since the length of an identifier has exceeded <i>IDMAX</i>, it is truncated to <i>IDMAX</i>.<br>
<br>
<li><samp>Illegal digit in octal number "089"</samp><br>
An octal numeric token contains 8 or 9.  Only pre-Standard mode issues this warning.  Standard mode does not check whether a numeric token on lines other than #if directives is correct or not.  If a #if expression has an octal numeric token containing 8 or 9, it will cause a "Not an integer" error.<br>
<br>
<li><samp>Unterminated string literal, catenated to the next line</samp><br>
Although an unterminated string literal in a logical line is normally regarded as an error, <b>mcpp</b> invoked with the -lang-asm (-x assembler-with-cpp, -a) option regards it as a multi-line string literal and concatenates the line with the next by inserting '\n'.  This way of writing has no advantage.  Using a functionality to concatenate adjacent string literals is preferable.<br>
<li><samp>Unterminated character constant 't understand.</samp><br>
<b>mcpp</b> in lang-asm mode does not regard this as an error.<br>
<li><samp>Empty character constant ''</samp><br>
A character constant is empty.
In lang-asm mode, <b>mcpp</b> only issues a warning.<br>
</ul>

<h3><a name="5.5.2" href="#toc.5.5.2">5.5.2. Unterminated Source File Related Warnings</a></h3>
<p>For unterminated lines or comments, the following messages are issued.  <i>OLDPREP</i> mode does not issue these warnings.</p>
<ul>
<li><samp>End of file with no newline, supplemented newline</samp><br>
A file must be terminated with a newline.  <b>mcpp</b> supplements the line with &lt;newline&gt;.<br>
<br>
<li><samp>End of file with \, deleted the \</samp><br>
A file must not be terminated with &lt;backslash&gt;&lt;newline&gt;.  <b>mcpp</b> deletes the &lt;backslash&gt;.<br>
<br>
<li><samp>End of file with unterminated comment, terminated the comment</samp><br>
A comment is not terminated.  <b>mcpp</b> terminates the comment.<br>
</ul>
<p>The following warning messages are issued in pre-Standard mode.  After issuing them, pre-Standard mode continues processing until it reaches the end of input, which may cause many unexpected results.  Standard mode issues an error.  <i>OLDPREP</i> mode does not issue even a warning, except on an unterminated macro.</p>
<ul>
<li><samp>End of file within #if (#ifdef) section starting from line 123</samp><br>
<br>
<li><samp>End of file within macro invocation starting from line 123</samp><br>
<br>
<li><samp>End of file with unterminated #asm block starting from line 123</samp><br>
#asm on the line 123 does not have a corresponding #endasm.<br>
</ul>

<h3><a name="5.5.3" href="#toc.5.5.3">5.5.3. Directive Line Related Warnings</a></h3>
<ul>
<li><samp>The macro is redefined</samp><br>
<b>mcpp</b> displays this message followed by the source filename and line number where this macro definition is found.<br>
The macro has been redefined with different content.  The source is probably not well organized.  Macro definitions with the same name may coexist only if the following conditions are met; otherwise, the new definition is regarded as a redefinition and a warning is issued.<br>
<br>
<ol>
<li>Have the same number of parameters.<br>
<li>Have the same replacement list. (One or more white space characters between tokens are regarded as one.  In <i>POSTSTD</i>, the difference of the token separators does not matter because any number of space characters is changed into one, regardless of the presence or absence of the token separators.)<br>
<li>In Standard mode, parameter names must be the same.  In <i>POSTSTD</i> mode and in pre-Standard mode, they are not checked.<br>
<br>
</ol>
<li><samp>Unknown argument "name"</samp><br>
'#pragma MCPP debug' or '#debug' has no such argument as "name".<br>
<li><samp>No argument</samp><br>
A '#pragma MCPP debug' or '#debug' does not have an argument.<br>
<li><samp>Not an identifier "123"</samp><br>
The argument of a '#pragma MCPP debug' or '#debug' is not an identifier.<br>
</ul>
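<p>The conditions for "The macro is redefined" listed above can be illustrated by the following sketch.  The macro names are hypothetical, and which lines are warned depends on the mode as described in the conditions.</p>
<pre>
#define LIMIT       (1 + 2)
#define LIMIT       (1   +   2)     /* same: a run of white space counts as one     */
#define LIMIT       (1 + 3)         /* different replacement list: redefinition     */

#define add(a, b)   ((a) + (b))
#define add(x, y)   ((x) + (y))     /* different parameter names: see condition 3   */
#define add(a)      ((a) + 1)       /* different number of parameters: redefinition */
</pre>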
<p>The following message is issued only in Standard mode.</p>
<ul>
<li><samp>"and" is defined as macro</samp><br>
"and" is defined as a macro in C++.<br>
Whereas in C95 "and" and the other names (11 in all) are defined as macros by &lt;iso646.h&gt;, they are operator tokens in C++.<br>
</ul>
<p>The following message is issued only in <i>STD</i> mode.</p>
<ul>
<li><samp>No space between macro name "MACRO" and repl-text</samp><br>
There is no space between the macro name and the replacement list of a #define line.  Normally, this does not happen, but it does happen when an illegal character is used in a macro name as follows:<br>
<pre>
#define THIS$AND$THAT(a, b)   ((a) + (b))
</pre>
<b>mcpp</b> interprets this as follows:<br>
<pre>
#define THIS  $AND$THAT(a, b) ((a) + (b))
</pre>
and issues a warning.  Of course, this is quite a rare case.<br>
</ul>
<p>The following warnings are issued only in lang-asm mode.</p>
<ul>
<li><samp>Illegal #directive "+"</samp>
<li><samp>Unknown #directive "pseudo-directive"</samp><br>
Though these are usually errors, lang-asm mode issues only warnings.<br>
</ul>
<p>The following warnings on a #pragma line are issued only in Standard mode.
In principle, the lines are output in spite of the warnings.
However, lines to be processed by the preprocessor, such as most of the lines starting with #pragma MCPP or #pragma GCC, are not output.
Pragmas for the compiler or linker, such as <samp>#pragma GCC visibility *</samp>, are output without a warning.</p>
<ul>
<li><samp>No sub-directive</samp><br>
A #pragma line does not have any argument.  The line is ignored.<br>
<br>
<li><samp>Unknown encoding "encoding"</samp><br>
The encoding name, "encoding", specified with #pragma __setlocale( "encoding") is not implemented.  For the encoding name, refer to <a href="#2.8"> 2.8</a>.<br>
<br>
<li><samp>Too long encoding name "encoding"</samp><br>
The encoding name, "long-long-encoding", specified with #pragma __setlocale( "long-long-encoding") exceeds 19 bytes.  <b>mcpp</b> ignores it.<br>
<br>
<li><samp>Bad push_macro syntax</samp><br>
<li><samp>Bad pop_macro syntax</samp><br>
There is a syntax error in #pragma MCPP push_macro, #pragma MCPP pop_macro, #pragma push_macro or #pragma pop_macro.  To use these #pragma directives, enclose the macro name in the argument with double quotes (" and ") and then further enclose it with ( ).  For example, ("MACRO"). (A redundant specification for compatibility with Visual C.)<br>
<br>
<li><samp>"MACRO" has not been defined</samp><br>
MACRO in the argument ("MACRO") for #pragma MCPP push_macro, #pragma MCPP pop_macro, #pragma push_macro, or #pragma pop_macro is not defined as a macro.<br>
<li><samp>"MACRO" is already pushed</samp><br>
MACRO of #pragma MCPP push_macro ("MACRO") has been pushed and then further #undef-ed.  Without redefining the MACRO, push would not be possible.<br>
<li><samp>"MACRO" has not been pushed</samp><br>
MACRO in #pragma MCPP pop_macro( "MACRO") has not been pushed.  It may have been already popped.<br>
</ul>
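<p>A typical push and pop sequence, together with two of the mistakes warned about above, might look like the following sketch.  <samp>MACRO</samp> and its values are placeholders for illustration.</p>
<pre>
#define MACRO   1
#pragma MCPP push_macro( "MACRO")   /* save the current definition          */
#undef  MACRO
#define MACRO   2                   /* temporary definition                 */
#pragma MCPP pop_macro( "MACRO")    /* restore the saved definition         */
#pragma MCPP pop_macro( "MACRO")    /* "MACRO" has not been pushed          */
#pragma MCPP push_macro( MACRO)     /* Bad push_macro syntax: no quotes     */
</pre>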
<p>The GCC-specific-build issues the following warnings:</p>
<ul>
<li><samp>Ignored #ident</samp><br>
<li><samp>Ignored #sccs</samp><br>
#ident or #sccs lines are ignored.<br>
</ul>
<p>GCC-specific-build issues a Class 2 warning on a line with #pragma GCC followed by either poison or dependency and does not output the line.  The GCC V.3 resident preprocessor processes the line but <b>mcpp</b> does not.</p>
<p>The following warnings are issued only in pre-Standard mode.  Standard mode regards them as errors.</p>
<ul>
<li><samp>Not in a #if (#ifdef) section in a source file</samp><br>
<li><samp>Line number "0x123" isn't a decimal digit sequence</samp><br>
</ul>
<p><i>KR</i> mode issues the following warning.  Standard mode issues the same warning only to #pragma once, #pragma MCPP put_defines, #pragma MCPP push_macro, #pragma MCPP pop_macro, #pragma push_macro, #pragma pop_macro, #pragma MCPP debug, #pragma MCPP end_debug, and #endif for GCC-specific-build on <i>STD</i> mode; for other directives, Standard mode issues an error.  <i>OLDPREP</i> mode issues neither an error nor a warning.</p>
<ul>
<li><samp>Excessive token sequence "junk"</samp><br>
</ul>

<h3><a name="5.5.4" href="#toc.5.5.4">5.5.4. #if Expression Related Warnings</a></h3>
<p>The following three warnings are relevant to an argument of #if, #elif, or #assert:</p>
<ul>
<li><samp>Macro "MACRO" is expanded to "defined"</samp><br>
The macro MACRO in a #if expression has been expanded to "defined".  <b>mcpp</b> treats this strange macro not as identifier but as operator.  How to treat it is undefined in Standard C.<br>
<br>
<li><samp>Macro "MACRO" is expanded to "sizeof"</samp><br>
The macro MACRO in a #if expression has been expanded to sizeof.  <b>mcpp</b> treats this strange macro not as identifier but as operator.  Pre-Standard mode issues this warning.<br>
<br>
<li><samp>Macro "MACRO" is expanded to 0 token</samp><br>
The macro MACRO has been expanded to zero token.  If this happens in a #if expression, it almost always causes an error.  The purpose of this warning is to indicate the cause of an error.<br>
</ul>
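<p>For instance, the first and third warnings above might be triggered by a sketch like the following.  <samp>DEF</samp>, <samp>EMPTY</samp> and <samp>FOO</samp> are hypothetical names.</p>
<pre>
#define DEF     defined
#define EMPTY

#if     DEF FOO         /* Macro "DEF" is expanded to "defined"             */
#endif
#if     1 + EMPTY       /* Macro "EMPTY" is expanded to 0 token; the        */
#endif                  /* remaining "1 +" then causes an error             */
</pre>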
<p>The following warnings are also relevant to an argument of #if, #elif or #assert.  They are not issued in a sub-expression whose evaluation is skipped. (<b>mcpp</b> invoked with the -W8 option issues them.)</p>
<ul>
<li><samp>Undefined escape sequence '\x'</samp><br>
There is no such escape sequence as \x.  \x is evaluated as a two byte sequence. (Of course, an escape sequence of "\x" followed by a hex string is valid.)  This warning is also issued on a UCN with an insufficient number of digits.<br>
</ul>
<p>The following warnings are relevant to operations and types in a constant expression on #if, #elif or #assert lines.  These warnings are not issued in a skipped sub-expression either. (<b>mcpp</b> invoked with -W8 issues them.)</p>
<p><b>mcpp</b> evaluates a #if expression in long long / unsigned long long even in C90 or C++98, and issues a warning on a value outside the range of long / unsigned long in C90 and C++98.  <b>mcpp</b> also issues a warning on an LL suffix in modes other than C99.  These warnings are of class 1 in compiler-independent-build and class 2 in compiler-specific-build.  In <i>POSTSTD</i> mode, character constants cannot be used in a #if expression, hence no warning is issued. (They cause errors instead.)</p>
<ul>
<li><samp>Constant "123456789012" is out of range of (unsigned) long</samp><br>
An integer constant has a value that exceeded the range of unsigned long.<br>
<li><samp>Integer character constant 'abcde' is out of range of unsigned long</samp><br>
A character constant 'abcde' has a value that exceeded the range of unsigned long.<br>
<li><samp>Wide character constant L'abc' is out of range of unsigned long</samp><br>
A wide character constant L'abc' has a value that exceeded the range of unsigned long.  This warning is issued only in <i>STD</i> mode.<br>
<li><samp>Result of "op" is out of range of (unsigned) long</samp><br>
An operation result using the operator op is out of range of (unsigned) long.  Op is any of binary operators: *, /, %, +, and -.  When two's complement representation is used, the unary operator '-' will cause an overflow with -<tt>LONG_MIN</tt>.  Unsigned long will never cause an overflow, so it does not cause an error.  If the result of an algebraic calculation is out of range, a warning is issued.<br>
<li><samp>LL suffix is used in other than C99 mode "123LL"</samp><br>
LL suffix is used for an integer in other than C99 mode.<br>
<li><samp>Shift count "40" is larger than bit count of long</samp><br>
The value of the right operand of a bit shift operator, &lt;&lt; or &gt;&gt;, exceeds the bit count of long.<br>
<br>
<li><samp>Negative value "-1" is converted to positive "18446744073709551615"</samp><br>
A mixture of signed and unsigned operations results in conversion of a signed negative value into an unsigned positive value.  This is not an error, but indicates source code may contain a bug.  For both operands of a binary operator, such as *, /, %, +, -, &lt;, &gt;, &lt;=, &gt;=, ==, !=, &amp;, ^ and | , and the second and third operands of a ternary operator, ? and :, if one operand is unsigned and the other is signed, the signed one is always converted into unsigned.<br>
<br>
<li><samp>Illegal shift count "-1"</samp><br>
The value of the right operand of a bit shift operator, &lt;&lt; or &gt;&gt;, is a negative number or exceeds the bit count of long long.  Probably, this is also a bug in source code.<br>
<br>
<li><samp>"op" of negative number isn't portable</samp><br>
If either or both operands of a binary operator (op) are negative, the operation lacks portability. "Op" is any of /, %, and &gt;&gt;.  The &gt;&gt; operator with a negative left operand is portable only across compiler systems on computers that have an arithmetic shift instruction, where a one-bit shift means a division by 2.  Otherwise, it is not portable.<br>
</ul>
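<p>As a minimal sketch of the "Negative value ... is converted to positive" case, the fragment below compares a signed -1 with an unsigned 0 in a #if expression.  Because the signed operand is converted to unsigned, the comparison is false.  The macro name <tt>NEGATIVE_IS_LESS</tt> and the function name are illustrative assumptions, not part of <b>mcpp</b>.</p>

```c
/* In a #if expression, -1 is converted to unsigned before the comparison,
   so it becomes the largest unsigned value and the test is false. */
#if -1 < 0U
#define NEGATIVE_IS_LESS 1
#else
#define NEGATIVE_IS_LESS 0
#endif

/* Expose the preprocessor's decision to ordinary code. */
int negative_is_less(void)
{
    return NEGATIVE_IS_LESS;
}
```

<p>Writing both operands with the same signedness (for example <tt>-1 &lt; 0</tt>) avoids the conversion and the warning.</p>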

<h3><a name="5.5.5" href="#toc.5.5.5">5.5.5. Macro Expansion Related Warnings</a></h3>
<p>In these warnings, <b>mcpp</b> displays a macro definition followed by the source filename and line number where the macro is defined.</p>
<ul>
<li><samp>Macro started at line 123 swallowed directive-like line</samp><br>
<b>mcpp</b> has read a line that begins with # as an argument of the macro that begins at line 123.  The macro invocation may have a bug.  If it had not been for the macro, the line that begins with # would have been interpreted as a directive line.  The same could be said if the macro had been located in a #if group whose evaluation is skipped, where the line is treated as a directive because such a macro is never expanded.<br>
<br>
<li><samp>Replacement text "sub(" of macro "head" involved subsequent text</samp><br>
Rescanning of the replacement list "sub(" of the macro "head" has involved the text succeeding the macro invocation.  From K&amp;R 1st through Standard C, this has not been regarded as an error, but if you receive this warning without having written such a macro deliberately, your macro definition or macro invocation is probably incorrect.  If you do intend to use such a macro, it is an unusual one.  Only <i>STD</i> mode issues this warning on class 1.  <i>COMPAT</i> mode issues it only on class 8.  In pre-Standard mode, the same situation may arise but no warning is issued.  <i>POSTSTD</i> mode never issues this warning because rescanning does not involve the text succeeding the replacement list. (A macro may be expanded quite differently or cause an "unterminated macro call" error.)<br>
<br>
<li><samp>Less than necessary N argument (s) in macro call "macro( a)"</samp><br>
A macro invocation has too few arguments.  Normally, this causes an error, but when only one argument is missing from a macro that takes a variable number of arguments, <b>mcpp</b> issues a warning instead.  This is to reduce migration problems with variable argument macros between GCC and C99.<br>
<br>
<li><samp>Removed ',' preceding the absent variable argument</samp><br>
Since the variable argument is absent and the replacement list has a sequence <samp>", ## __VA_ARGS__"</samp>, <b>mcpp</b> has removed the comma immediately preceding <samp>"## __VA_ARGS__"</samp>.
This warning is issued only in Standard mode of the GCC-specific-build.<br>
<br>
<li><samp>Old style predefined macro "linux" is used</samp><br>
A non-conforming predefined macro without a leading '_' is used.
This warning is issued only in Standard mode of the GCC-specific-build.<br>
</ul>
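<p>The "involved subsequent text" situation can be sketched as follows.  The macro names <tt>head</tt> and <tt>sub</tt> follow the message text above; the function name and the argument values are assumptions made for illustration.  The code is valid Standard C, which is why <i>STD</i> mode only warns about it.</p>

```c
#define sub(x, y) ((x) - (y))
#define head sub(

/* "head" expands to "sub(", and rescanning pulls in the following
   "5, 3)" to complete the call, giving ((5) - (3)). */
int head_involves_subsequent_text(void)
{
    return head 5, 3);
}
```

<p>Writing the call as <tt>sub(5, 3)</tt> directly gives the same result without depending on text that follows the macro invocation.</p>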
<p>The following two are issued only in <i>OLDPREP</i> mode. (In other modes they cause errors.)</p>
<ul>
<li><samp>Less than necessary N argument(s) in macro call "macro( a)"</samp><br>
<li><samp>More than necessary N argument(s) in macro call "macro( a, b, c)"</samp><br>
</ul>

<h3><a name="5.5.6" href="#toc.5.5.6">5.5.6. Line Number Related Warnings</a></h3>
<p>This section covers line number related warnings.</p>
<ul>
<li><samp>Line number "32768" is out of range of [1,32767]</samp><br>
In C90 and C++, the first argument of a #line must be within the range of 1 to 32767.  0 is also out of range.  With <tt>__STDC_VERSION__</tt> &gt;= 199901L or <tt>__cplusplus</tt> &gt;= 199901L, the valid range is 1 to 2147483647.  Therefore, in C90 or C++ mode, <b>mcpp</b> issues a warning, not an error, for values in the range of 32768 to 2147483647.  Standard mode issues this warning.<br>
</ul>
<p>In C90, when you use #line to specify a value slightly below 32767, you won't receive an error, but sooner or later the line number will exceed 32767, in which case <b>mcpp</b> continues to increase the line number while issuing a warning.  Some compilers proper may not accept such a large line number.  It is not desirable to specify a large number with #line.</p>
<ul>
<li><samp>Line number 32768 got beyond range</samp><br>
The source line number has reached 32768, at which a warning is issued one time.<br>
<br>
<li><samp>Line number 32769 is out of range</samp><br>
When the <tt>__LINE__</tt> macro is expanded, the line number has exceeded 32767.<br>
</ul>
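<p>As a sketch of how a #line value slightly below 32767 runs past the C90 limit, the fragment below sets the line number to 32766 and expands <tt>__LINE__</tt> two lines later.  The function name is an assumption made for illustration; in C90 mode <b>mcpp</b> would be expected to warn about line numbers like this one, while the value itself is still usable.</p>

```c
#line 32766
/* This line is numbered 32766 by the directive above. */
/* This line is 32767, the largest value C90 guarantees. */
int line_beyond_c90_limit(void) { return __LINE__; }
```

<p>The function returns 32768, one past the C90 range.</p>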

<h3><a name="5.5.7" href="#toc.5.5.7">5.5.7. #pragma MCPP warning, #warning</a></h3>
<ul>
<li><samp>#warning</samp><br>
<li><samp>#pragma MCPP warning</samp><br>
A #pragma MCPP warning (#warning) directive has been executed.  Following the above message, the line is displayed. (If an argument of #pragma MCPP warning has a token error, such as unterminated string, #pragma MCPP warning is not executed.)  Although this directive appears in the Warning Level 1 section, this warning is issued at every warning level.  Standard mode has #pragma MCPP warning, while pre-Standard mode has #warning.<br>
</ul>
<br>

<h2><a name="5.6" href="#toc.5.6">5.6. Warnings (Class 2)</a></h2>
<p>This section covers warnings for code that does not contain a bug but causes a portability problem.</p>
<ul>
<li><samp>Converted [CR+LF] to [LF]</samp><br>
Converted the newline code from [CR+LF] to [LF].  This warning is issued on warning level of class 1 in compiler-independent-build and class 2 in compiler-specific-build.<br>
</ul>
<p><b>mcpp</b> evaluates #if expressions in long long / unsigned long long even in C90 or C++98, and issues a warning on a value outside the range of long / unsigned long in C90 and C++98.  An LL suffix in other than C99 mode also gets a warning, as does the i64 suffix in compiler-specific-builds for Visual C and Borland C.  These warnings are of class 1 in compiler-independent-build and class 2 in compiler-specific-build.</p>
<ul>
<li><samp>Constant "123456789012" is out of range of (unsigned) long</samp><br>
<li><samp>Integer character constant 'abcde' is out of range of unsigned long</samp><br>
<li><samp>Wide character constant L'abc' is out of range of unsigned long</samp><br>
<li><samp>Result of "op" is out of range of (unsigned) long</samp><br>
<li><samp>LL suffix is used in other than C99 mode "123LL"</samp><br>
<li><samp>I64 suffix is used in other than C99 mode "123i64"</samp><br>
<li><samp>Shift count "40" is larger than bit count of long</samp><br>
</ul>
<p>Only the Standard mode issues the following five warnings:</p>
<ul>
<li><samp>Parsed "//" as comment</samp><br>
A text from // to the end of the line is interpreted as a comment.  This is a legal notation in C99 and C++.  In C90 mode <b>mcpp</b> treats it as a comment after issuing a warning.<br>
<br>
<li><samp>Variable argument macro is defined</samp><br>
Although it is the C99 Standard that stipulates variable argument macros, a variable argument macro has been defined in C90 or C++ mode.<br>
<br>
<li><samp>Empty argument in macro call "MACRO( a, ,"</samp><br>
A macro invocation has an empty argument, in which case <b>mcpp</b> regards the argument as a sequence of zero pp-tokens and treats it as such.  The empty argument is legal in C99, while it is undefined in C90, thus causing a lack of portability.  (<b>mcpp</b> regards an argument that lacks even its ',' not as an empty argument but as a missing argument, and issues an error.  However, since zero arguments and one empty argument are syntactically indistinguishable, <b>mcpp</b> treats neither as an error.)  Writing an empty argument in source code is generally not preferable.  If possible, I recommend that you code:<br>
<pre>
#define EMPTY
</pre>
and then write <tt>EMPTY</tt> where an empty argument would be written.<br>
<br>
<li><samp>Skipped the #pragma line</samp><br>
GCC V.3 provides several #pragma directives in the form of #pragma GCC &lt;args&gt;.  Its preprocessor processes some of them, but <b>mcpp</b> does not support #pragma GCC poison and #pragma GCC dependency.  This warning is issued for a #pragma directive that the compiler-specific preprocessor processes but <b>mcpp</b> does not.<br>
<br>
<li><samp>Not a valid preprocessing token "+12"</samp><br>
Concatenating two pp-tokens with the ## operator results in an invalid token "+12", which normally causes an error.  However, <b>mcpp</b> invoked with the -lang-asm  (-x assembler-with-cpp, -a) option does not regard it as an error.<br>
</ul>
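<p>The <tt>EMPTY</tt> recommendation above can be sketched as follows.  Only <tt>EMPTY</tt> comes from the text; the <tt>DECLARE</tt> macro, the variable, and the function are hypothetical names chosen for illustration.</p>

```c
#define EMPTY
#define DECLARE(qualifier, type, name) qualifier type name

/* Instead of DECLARE(, int, counter), which has an empty first argument,
   pass EMPTY; it expands to nothing during argument expansion. */
DECLARE(EMPTY, int, counter) = 7;

int read_counter(void)
{
    return counter;
}
```

<p>The declaration expands to <tt>int counter = 7;</tt> in both C90 and C99, so no empty argument appears in the source.</p>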
<p>The following warning is issued only in <i>POSTSTD</i> mode.</p>
<ul>
<li><samp>Header-name enclosed by &lt;, &gt; is an obsolescent feature &lt;stdio.h&gt;</samp><br>
The header name in the form of &lt;stdio.h&gt; is one of the specifications I want to abolish.  I recommend using "stdio.h".<br>
</ul>
<p>The following warnings are issued only on some compiler systems.  Of course, the coding in question is valid on those particular systems, but it lacks portability, so a warning is issued to remind users of it.</p>
<ul>
<li><samp>#include_next is not allowed by Standard</samp><br>
<li><samp>#warning is not allowed by Standard</samp><br>
These directives are valid in GCC but do not conform to Standard C and lack portability.<br>
<li><samp>GCC2-spec variadic macro is defined</samp><br>
Though GCC2-spec variadic macro is valid in GCC-specific-build, this macro is not portable.<br>
<br>
<li><samp>Converted \ to /</samp><br>
A #include directive contains \ in the header name.  <b>mcpp</b> converts \ into /.  "\" is a valid path delimiter in OSs, such as Windows, but undefined in Standard C.  It is safe to use /.  <b>mcpp</b> on Windows issues this warning only once. (<b>mcpp</b> does not regard " preceded by \ as a delimiter of a string literal, raising an "unterminated string literal" error.)<br>
<br>
<li><samp>'$' in identifier "THIS$AND$THAT"</samp><br>
An identifier contains a '$'.  Only <b>mcpp</b> compiled with <i>DOLLAR_IN_NAME</i> set to <i>TRUE</i> issues this warning, and only once: '$' is valid in such a build but lacks portability.  Since '$' is otherwise regarded as a pp-token, other builds of <b>mcpp</b> parse THIS$AND$THAT into five components THIS,  $,  AND, $ and THAT, resulting in a compiler error.<br>
</ul>
<br>

<h2><a name="5.7" href="#toc.5.7">5.7. Warnings (Class 4)</a></h2>
<p>Standard C guarantees some minimum translation limits.  It is desirable that a preprocessor impose translation limits that exceed these values, but source programs that rely on a preprocessor's own translation limits have restricted portability.  <b>mcpp</b> provides some macros in "system.H" that allow you to set translation limits to any values you like.  <b>mcpp</b> in Standard mode issues a warning for source code that exceeds a limit guaranteed by Standard C.  However, these messages are excluded from Class 1 and 2 because they may be issued frequently, depending on standard headers of compiler systems or source programs.</p>
<ul>
<li><samp>Logical source line longer than 509 bytes</samp><br>
The length of a logical source line has exceeded 509 bytes.<br>
<br>
<li><samp>Quotation longer than 509 bytes "very_very_long_string"</samp><br>
The length of a string literal, character constant or header name has exceeded 509 bytes.<br>
<br>
<li><samp>More than 8 nesting of #include</samp><br>
The depth of nested #includes has exceeded 8.  This warning is issued only when it reaches 9.<br>
<br>
<li><samp>More than 8 nesting of #if (#ifdef) sections</samp><br>
The depth of nested #ifs, #ifdefs, or #ifndefs has exceeded 8.  This warning is issued only when it reaches 9.<br>
<br>
<li><samp>More than 1024 macros defined</samp><br>
The number of defined macros has reached 1024.  This number includes both predefined macros and those defined in header files.<br>
<br>
<li><samp>String literal longer than 509 bytes "very_very_long_string"</samp><br>
Expansion of a macro using the # operator has generated a string literal longer than 509 bytes.<br>
</ul>
<p>The following warnings are not issued in a skipped #if group.</p>
<ul>
<li><samp>More than 32 nesting of parens in #if expression</samp><br>
The depth of nested parentheses in a #if expression has exceeded 32.  This warning is issued only when it reaches 33.<br>
<br>
<li><samp>More than 31 parameters</samp><br>
The number of parameters of a macro definition has exceeded 31.<br>
<br>
<li><samp>Identifier longer than 31 bytes "very_very_long_name"</samp><br>
The length of an identifier has exceeded 31 bytes.<br>
</ul>
<p>With <tt>__STDC_VERSION__</tt> &gt;= 199901L, the Standard specified translation limits are as follows:</p>
<blockquote>
<table>
  <tr><th>Length of logical source line                               </th><td>4095 bytes</td></tr>
  <tr><th>Length of string literal, character constant, or header name</th><td>4095 bytes</td></tr>
  <tr><th>Identifier length                                           </th><td>63 characters</td></tr>
  <tr><th>Depth of nested #includes                                   </th><td>15</td></tr>
  <tr><th>Depth of nested #ifs, #ifdefs, or #ifndefs                  </th><td>63</td></tr>
  <tr><th>Depth of nested parentheses in #if expression               </th><td>63</td></tr>
  <tr><th>Number of macro parameters                                  </th><td>127</td></tr>
  <tr><th>Number of definable macros                                  </th><td>4095</td></tr>
</table>
</blockquote>
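<p>Source code that wants to stay within the guaranteed limits can select them by language level.  The following is only a sketch: the macro and function names are assumptions made for illustration, and the values are the C90 and C99 limits quoted in this section.</p>

```c
/* Pick the guaranteed minimums quoted in this section by language level. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define GUARANTEED_LINE_BYTES    4095
#define GUARANTEED_INCLUDE_DEPTH 15
#else
#define GUARANTEED_LINE_BYTES    509
#define GUARANTEED_INCLUDE_DEPTH 8
#endif

int guaranteed_line_bytes(void) { return GUARANTEED_LINE_BYTES; }
int guaranteed_include_depth(void) { return GUARANTEED_INCLUDE_DEPTH; }
```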
<p>Note that the length of a UCN or multi-byte-character in an identifier is counted as the number of characters, not bytes. (A queer stipulation)</p>
<p>When <b>mcpp</b> is invoked with the -+ option to specify C++ preprocessing, the Standard's guidelines for translation limits are as follows:</p>
<blockquote>
<table>
  <tr><th>Length of logical source line                               </th><td>65536 bytes</td></tr>
  <tr><th>Length of string literal, character constant, or header name</th><td>65536 bytes</td></tr>
  <tr><th>Identifier length                                           </th><td>1024 characters</td></tr>
  <tr><th>Depth of nested #includes                                   </th><td>256</td></tr>
  <tr><th>Depth of nested #ifs, #ifdefs, or #ifndefs                  </th><td>256</td></tr>
  <tr><th>Depth of nested parentheses in #if expression               </th><td>256</td></tr>
  <tr><th>Number of macro parameters                                  </th><td>256</td></tr>
  <tr><th>Number of definable macros                                  </th><td>65536</td></tr>
</table>
</blockquote>
<p>Note that <b>mcpp</b> allows at most 255 macro parameters.  So, when the number reaches 256, <b>mcpp</b> issues an error.</p>
<p>The following warnings are excluded from class 1 and 2 because they are issued too frequently.</p>
<ul>
<li><samp>Converted 0x0c to a space</samp><br>
[FF], [VT], and [CR] (other than in a [CR][LF] sequence) used in source code as token separators are converted into a space character.  How to deal with these token separators on a directive line is undefined in Standard C.  If they are located in comments, string literals, or character constants, <b>mcpp</b> does not convert them. (Of course, <b>mcpp</b> could do so, but I do not want <b>mcpp</b> to impose a greater restriction on the character set used, since it essentially depends on the compiler proper.)  On the other hand, [TAB] as a token separator is converted into a space character, but no warning is issued because it does not affect compilation at all. ([TAB] means nothing but a space to both the preprocessor and the compiler proper.)  [FF] is sometimes found in actual source to indicate "end of page".  This is not a recommended style.<br>
<br>
<li><samp>Undefined symbol "name", evaluated to 0</samp><br>
In a #if line, the identifier "name" is not defined as a macro.  It is evaluated to zero.  This is not an error at all, but may be a program bug.  No warning is issued for the operand of defined in a #if.  This warning can be avoided by writing #if defined name &amp;&amp; (name ..), instead of #if name .., or by invoking <b>mcpp</b> with the -D name=0 option.  C++ gives the "true" and "false" tokens special treatment and evaluates them to 1 and 0, respectively, without a warning.<br>
<br>
<li><samp>Multi-character wide character constant L'ab' isn't portable</samp><br>
A wide character constant value varies even among compiler systems using the same character set because the encoding scheme of wide character constants and how to evaluate multi-characters depend on compiler systems.  Therefore, #if expressions using them are not portable.  Only <i>STD</i> mode issues this warning.  <i>POSTSTD</i> mode does not permit a character literal in a #if expression, so this causes an error. (The next item is treated the same way.)<br>
<br>
<li><samp>Multi-character or multi-byte character constant 'XY' isn't portable</samp><br>
Since how to evaluate the value of a multi-character or multi-byte character constant depends on compiler systems, #if expressions using them are not portable.  Only <i>STD</i> mode issues this warning.<br>
</ul>
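<p>The <tt>defined</tt> guard suggested for the "Undefined symbol" warning can be sketched as follows.  <tt>FEATURE_LEVEL</tt> and <tt>USE_FAST_PATH</tt> are hypothetical names; <tt>FEATURE_LEVEL</tt> is deliberately left undefined so that the guarded comparison is never evaluated.</p>

```c
/* FEATURE_LEVEL is deliberately left undefined here.  The defined test
   is false, so the comparison on the right is skipped. */
#if defined(FEATURE_LEVEL) && (FEATURE_LEVEL >= 2)
#define USE_FAST_PATH 1
#else
#define USE_FAST_PATH 0
#endif

int use_fast_path(void)
{
    return USE_FAST_PATH;
}
```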
<p>The following two warnings are issued only in Standard mode.</p>
<ul>
<li><samp>Macro with mixing of ## and # operators isn't portable</samp><br>
A function-like macro has a token sequence of "## #" in the replacement list.  This sequence of two operators lacks portability because their evaluation order is unspecified in Standard C.  <b>mcpp</b> gives # precedence over ##.  Note that if a function-like macro has a token sequence in the reverse order "# ##", <b>mcpp</b> regards it as an error because the operand of the # operator must be a parameter.<br>
<br>
<li><samp>Macro with multiple ## operators isn't portable</samp><br>
A macro definition has ## operators separated by only one token or parameter in the replacement list, as in a ## b ## c.  This macro may lack portability because the evaluation order of ## operators is unspecified in Standard C.  <b>mcpp</b> applies the ## operator from left to right.<br>
</ul>
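<p>One way to avoid depending on the evaluation order of multiple ## operators is to nest two-operand pastes so that the order is explicit.  This is only a sketch; <tt>CAT</tt>, <tt>XCAT</tt>, <tt>CAT3</tt> and the identifier being built are hypothetical names, not part of <b>mcpp</b>.</p>

```c
#define CAT(a, b)      a ## b
#define XCAT(a, b)     CAT(a, b)
#define CAT3(a, b, c)  XCAT(XCAT(a, b), c)

/* CAT3(item, _, count) pastes "item" and "_" first, then appends "count". */
static int CAT3(item, _, count) = 42;

int read_item_count(void)
{
    return item_count;
}
```

<p>Each replacement list now contains a single ## operator, so the result does not depend on the order in which a preprocessor applies them.</p>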
<p>The following warning is issued only with the -K option in <i>STD</i> mode.</p>
<ul>
<li><samp>Too long comment, discarded up to here</samp><br>
The comment is longer than 256 lines, so <b>mcpp</b> discards it up to this point.
It is not good style to include a long document in source.<br>
</ul>
<br>

<h2><a name="5.8" href="#toc.5.8">5.8. Warnings (Class 8)</a></h2>
<p>There is little chance that the indicated source code contains a bug, but these messages are issued to call attention to it.  <b>mcpp</b> invoked with the -W8 option issues these warnings.</p>
<p>In a skipped #if group, whether preprocessing directives, such as #ifdef, #ifndef, #elif, #else, and #endif, are balanced or not is checked.  However, <b>mcpp</b> invoked with the -W8 option also checks non-conforming or unknown directives.  Standard mode issues a warning when the depth of nested #ifs exceeds 8.</p>
<ul>
<li><samp>Illegal #directive "+" (in skipped block)</samp><br>
<li><samp>Unknown #directive "pseudo-directive" (in skipped block)</samp><br>
<li><samp>More than 8 nesting of #if (#ifdef) sections (in skipped block)</samp><br>
<li><samp>#include_next is not allowed by Standard (in skipped block)</samp><br>
<li><samp>#warning is not allowed by Standard (in skipped block)</samp><br>
</ul>
<p>The following warnings are related to #if expression.  Given an expression of <samp>#if a || b</samp>, for example, if "a" is true, "b" is not evaluated.  However, <b>mcpp</b> invoked with -W8 issues a warning to non-evaluated sub-expressions, in which case, the note saying "in non-evaluated sub-expression" is appended.</p>
<ul>
<li><samp>Constant "123456789012345678901" is out of range</samp><br>
<li><samp>Constant "123456789012" is out of range of (unsigned) long</samp><br>
<li><samp>LL suffix is used in other than C99 mode "123LL"</samp><br>
<li><samp>I64 suffix is used in other than C99 mode "123i64"</samp><br>
<li><samp>Shift count "40" is larger than bit count of long</samp><br>
<li><samp>Integer character constant 'abcdefghi' is out of range</samp><br>
<li><samp>Integer character constant 'abcde' is out of range of unsigned long</samp><br>
<li><samp>Wide character constant L'abcdef' is out of range</samp><br>
<li><samp>Wide character constant L'abc' is out of range of unsigned long</samp><br>
<li><samp>8 bits can't represent escape sequence '\x123'</samp><br>
<li><samp>16 bits can't represent escape sequence L'\x12345'</samp><br>
<li><samp>Division by zero</samp><br>
<li><samp>Undefined symbol "name", evaluated to 0</samp><br>
<li><samp>sizeof: Unknown type "type"</samp><br>
<li><samp>sizeof: Illegal type combination with "type"</samp><br>
<li><samp>Multi-character wide character constant L'ab' isn't portable</samp><br>
<li><samp>Multi-character or multi-byte character constant 'XY' isn't portable</samp><br>
<li><samp>Undefined escape sequence '\x'</samp><br>
<li><samp>UCN cannot specify the value "0000007f"</samp><br>
<li><samp>Negative value "-1" is converted to positive "18446744073709551615"</samp><br>
<li><samp>Result of "op" is out of range</samp><br>
<li><samp>Result of "op" is out of range of (unsigned) long</samp><br>
<li><samp>Illegal shift count "-1"</samp><br>
<li><samp>"op" of negative number isn't portable</samp><br>
<br>
<li><samp>sizeof is disallowed in C Standard</samp><br>
The purpose of this warning is to remind users of the fact that Standard C does not allow for #if sizeof, although pre-Standard mode implements it.<br>
<br>
<li><samp>"MACRO" wasn't defined</samp><br>
An undefined name is specified with #undef.  Standard C does not regard it as an error.<br>
<br>
<li><samp>Macro "macro" needs arguments</samp><br>
A token with the same name as a macro with arguments appears on its own, without an argument list.  <b>mcpp</b> does not expand it and leaves it as it is.  Only pre-Standard mode issues this warning. (Standard mode does not issue a warning since such a token does not cause any problem.)<br>
<br>
<li><samp>Replacement text "sub(" of macro "head" involved subsequent text</samp><br>
Rescanning of the replacement list "sub(" of the macro "head" has involved the text succeeding the macro invocation.  <i>COMPAT</i> mode issues this warning only on class 8, whereas <i>STD</i> mode issues it on class 1.<br>
</ul>
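<p>The "needs arguments" case can be sketched as follows: the name of a function-like macro that is not followed by '(' is left unexpanded, so in Standard C it can still refer to a function of the same name.  All names here are hypothetical and chosen for illustration.</p>

```c
/* The function is defined before the macro so its definition is not expanded. */
static int square(int x)
{
    return x * x;
}

#define square(x) ((x) * (x))

static int apply(int (*fn)(int), int value)
{
    return fn(value);
}

/* "square" followed by "," is not a macro call, so it names the function. */
int apply_square(void)
{
    return apply(square, 4);
}
```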
<br>

<h2><a name="5.9" href="#toc.5.9">5.9. Warnings (Class 16)</a></h2>
<p>Trigraphs and digraphs are not used at all in an environment where they are not needed.  If they are found in such an environment, they deserve attention.  The purpose of the -W16 option is to find such trigraphs and digraphs.  On the other hand, these warnings are very bothersome in an environment where trigraphs or digraphs are used on a regular basis because they are issued very frequently.  For this reason, I set up a separate class for these warnings.  In any case, <b>mcpp</b> issues these messages only when trigraphs or digraphs are enabled.  Digraphs are for Standard mode only, and trigraphs are for <i>STD</i> mode only.</p>
<ul>
<li><samp>2 trigraph(s) converted</samp><br>
Two trigraph sequences in this physical line have been converted.  Does the programmer really intend to write trigraphs?<br>
<br>
<li><samp>2 digraph(s) converted</samp><br>
Two digraph sequences in this line have been converted.  Does the programmer really intend to write digraphs?  <b>mcpp</b> compiled with <i>HAVE_DIGRAPHS</i> == <i>FALSE</i> in <i>STD</i> mode converts a digraph into a regular token in the following manner after preprocessing:<br>
<pre>
&lt;% -&gt; {      &lt;: -&gt; [      %:    -&gt; #
%&gt; -&gt; }      :&gt; -&gt; ]      %:%:  -&gt; ##
</pre>
Therefore, the compiler proper need not be able to handle digraphs.  However, <i>POSTSTD</i> mode converts a digraph into a regular pp-token during translation phase 1.  The difference in behavior between the modes appears when a # operator converts a digraph into a string; <i>STD</i> mode directly converts a digraph sequence into a string, while <i>POSTSTD</i> mode converts it into a regular pp-token, and then into a string.  In addition, if a string literal contains a character sequence which is equivalent to a digraph sequence, <i>STD</i> mode does not convert it, while <i>POSTSTD</i> mode converts it into a character sequence of the corresponding pp-tokens.<br>
<br>
<i>STD</i> mode does not issue a warning for a digraph that appears on a preprocessing-directive line and disappears in due course, because this warning is issued only for converted digraphs.<br>
</ul>
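<p>The digraph behavior described for <i>STD</i> mode, which matches Standard C, can be sketched as follows.  The function names and the <tt>STR</tt> macro are assumptions made for illustration, and the sketch assumes a compiler system that accepts digraphs (C95 or later).</p>

```c
#include <string.h>

#define STR(x) #x

/* Written with digraphs: <% %> stand for { }, and <: :> stand for [ ]. */
int sum_with_digraphs(void)
<%
    int values<:3:> = <% 1, 2, 3 %>;
    return values<:0:> + values<:1:> + values<:2:>;
%>

/* Standard C keeps the digraph spelling when the # operator stringizes it. */
int digraph_spelling_is_kept(void)
{
    return strcmp(STR(<:), "<:") == 0;
}
```

<p>Under the <i>POSTSTD</i> behavior described above, the stringized result would instead be the spelling of the regular pp-token.</p>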
<br>

<h2><a name="5.10" href="#toc.5.10">5.10. Diagnostic Messages Index</a></h2>
<table border='1' frame='below'><tr><th rowspan='2'>Diagnostic Message</th><th rowspan='2'>Fatal<br>error</th><th rowspan='2'>Error</th><th colspan='5'>Warning class</th></tr>
<tr><th>1</th><th>2</th><th>4</th><th>8</th><th>16</th></tr>
<tr><td><samp>"..." isn't the last parameter</samp></td><td></td><td><a href='#5.4.7'>5.4.7</td></tr>
<tr><td><samp>"/*" in comment</samp></td><td></td><td></td><td><a href='#5.5.1'>5.5.1</td></tr>
<tr><td><samp>"and" is defined as macro</samp></td><td></td><td></td><td><a href='#5.5.3'>5.5.3</td></tr>
<tr><td><samp>"defined" shouldn't be defined</samp></td><td></td><td><a href='#5.4.7'>5.4.7</td></tr>
<tr><td><samp>"MACRO" has not been defined</samp></td><td></td><td></td><td><a href='#5.5.3'>5.5.3</td></tr>
<tr><td><samp>"MACRO" has not been pushed</samp></td><td></td><td></td><td><a href='#5.5.3'>5.5.3</td></tr>
<tr><td><samp>"MACRO" is already pushed</samp></td><td></td><td></td><td><a href='#5.5.3'>5.5.3</td></tr>
<tr><td><samp>"MACRO" wasn't defined</samp></td><td></td><td></td><td></td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>"op" of negative number isn't portable</samp></td><td></td><td></td><td><a href='#5.5.4'>5.5.4</td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>"__STDC__" shouldn't be redefined</samp></td><td></td><td><a href='#5.4.7'>5.4.7</td></tr>
<tr><td><samp>"__STDC__" shouldn't be undefined</samp></td><td></td><td><a href='#5.4.8'>5.4.8</td></tr>
<tr><td><samp>"__VA_ARGS__" without corresponding "..."</samp></td><td></td><td><a href='#5.4.7'>5.4.7</td></tr>
<tr><td><samp>"__VA_ARGS__" cannot be used in GCC2-spec variadic macro</samp></td><td></td><td><a href='#5.4.7'>5.4.7</td></tr>
<tr><td><samp>## after ##</samp></td><td></td><td><a href='#5.4.7'>5.4.7</td></tr>
<tr><td><samp>#error</samp></td><td></td><td><a href='#5.4.10'>5.4.10</td></tr>
<tr><td><samp>#include_next is not allowed by Standard</samp></td><td></td><td></td><td></td><td><a href='#5.6'>5.6</td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>#warning</samp></td><td></td><td></td><td><a href='#5.5.7'>5.5.7</td></tr>
<tr><td><samp>'$' in identifier "THIS$AND$THAT"</samp></td><td></td><td></td><td></td><td><a href='#5.6'>5.6</td></tr>
<tr><td><samp>16 bits can't represent escape sequence L'\x12345'</samp></td><td></td><td><a href='#5.4.6'>5.4.6</td><td></td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>2 digraph(s) converted</samp></td><td></td><td></td><td></td><td></td><td></td><td></td><td><a href='#5.9'>5.9</td></tr>
<tr><td><samp>2 trigraph(s) converted</samp></td><td></td><td></td><td></td><td></td><td></td><td></td><td><a href='#5.9'>5.9</td></tr>
<tr><td><samp>8 bits can't represent escape sequence '\x123'</samp></td><td></td><td><a href='#5.4.6'>5.4.6</td><td></td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>_Pragma operator found in directive line</samp></td><td></td><td><a href='#5.4.12'>5.4.12</td></tr>
<tr><td><samp>Already seen #else at line 123</samp></td><td></td><td><a href='#5.4.3'>5.4.3</td></tr>
<tr><td><samp>Bad defined syntax</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>Bad pop_macro syntax</samp></td><td></td><td></td><td><a href='#5.5.3'>5.5.3</td></tr>
<tr><td><samp>Bad push_macro syntax</samp></td><td></td><td></td><td><a href='#5.5.3'>5.5.3</td></tr>
<tr><td><samp>Buffer overflow expanding macro "macro" at "something"</samp></td><td></td><td><a href='#5.4.9'>5.4.9</td></tr>
<tr><td><samp>Buffer overflow scanning token "token"</samp></td><td><a href='#5.3.3'>5.3.3</td></tr>
<tr><td><samp>Bug:</samp></td><td><a href='#5.3.1'>5.3.1</td></tr>
<tr><td><samp>Can't open include file "file-name"</samp></td><td></td><td><a href='#5.4.11'>5.4.11</td></tr>
<tr><td><samp>Can't use a character constant 'a'</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>Can't use a string literal "string"</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>Can't use the character 0x24</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>Can't use the operator "++"</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>Constant "123456789012" is out of range of (unsigned) long</samp></td><td></td><td></td><td><a href='#5.5.4'>5.5.4</td><td><a href='#5.6'>5.6</td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Constant "1234567890123456789012" is out of range</samp></td><td></td><td><a href='#5.4.6'>5.4.6</td><td></td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Converted 0x0c to a space</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td></tr>
<tr><td><samp>Converted [CR+LF] to [LF]</samp></td><td></td><td></td><td><a href='#5.5.1'>5.5.1</td><td><a href='#5.6'>5.6</td></tr>
<tr><td><samp>Converted \ to /</samp></td><td></td><td></td><td></td><td><a href='#5.6'>5.6</td></tr>
<tr><td><samp>Division by zero</samp></td><td></td><td><a href='#5.4.6'>5.4.6</td><td></td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Duplicate parameter names "a"</samp></td><td></td><td><a href='#5.4.7'>5.4.7</td></tr>
<tr><td><samp>Empty argument in macro call "MACRO( a, ,"</samp></td><td></td><td></td><td></td><td><a href='#5.6'>5.6</td></tr>
<tr><td><samp>Empty character constant ''</samp></td><td></td><td><a href='#5.4.1'>5.4.1</td><td><a href='#5.5.1'>5.5.1</td></tr>
<tr><td><samp>Empty parameter</samp></td><td></td><td><a href='#5.4.7'>5.4.7</td></tr>
<tr><td><samp>End of file with no newline, supplemented the newline</samp></td><td></td><td></td><td><a href='#5.5.2'>5.5.2</td></tr>
<tr><td><samp>End of file with unterminated #asm block started at line 123</samp></td><td></td><td><a href='#5.4.2'>5.4.2</td><td><a href='#5.5.2'>5.5.2</td></tr>
<tr><td><samp>End of file with unterminated comment, terminated the comment</samp></td><td></td><td></td><td><a href='#5.5.2'>5.5.2</td></tr>
<tr><td><samp>End of file with \, deleted the \</samp></td><td></td><td></td><td><a href='#5.5.2'>5.5.2</td></tr>
<tr><td><samp>End of file within #if (#ifdef) section started at line 123</samp></td><td></td><td><a href='#5.4.2'>5.4.2</td><td><a href='#5.5.2'>5.5.2</td></tr>
<tr><td><samp>End of file within macro call started at line 123</samp></td><td></td><td><a href='#5.4.2'>5.4.2</td><td><a href='#5.5.2'>5.5.2</td></tr>
<tr><td><samp>Excessive ")"</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>Excessive token sequence "junk"</samp></td><td></td><td><a href='#5.4.4'>5.4.4</td><td><a href='#5.5.3'>5.5.3</td></tr>
<tr><td><samp>File read error</samp></td><td><a href='#5.3.2'>5.3.2</td></tr>
<tr><td><samp>File write error</samp></td><td><a href='#5.3.2'>5.3.2</td></tr>
<tr><td><samp>GCC2-spec variadic macro is defined</samp></td><td></td><td></td><td></td><td><a href='#5.6'>5.6</td></tr>
<tr><td><samp>Header-name enclosed by &lt;, &gt; is an obsolescent feature &lt;stdio.h&gt;</samp></td><td></td><td></td><td></td><td><a href='#5.6'>5.6</a></td></tr>
<tr><td><samp>I64 suffix is used in other than C99 mode "123i64"</samp></td><td></td><td></td><td></td><td><a href='#5.6'>5.6</td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Identifier longer than 31 bytes "very_very_long_name"</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td></tr>
<tr><td><samp>Ignored #ident</samp></td><td></td><td></td><td><a href='#5.5.3'>5.5.3</td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Ignored #sccs</samp></td><td></td><td></td><td><a href='#5.5.3'>5.5.3</td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Illegal #directive "123"</samp></td><td></td><td><a href='#5.4.4'>5.4.4</td><td><a href="#5.5.3">5.5.3</td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Illegal control character 0x1b in quotation</samp></td><td></td><td></td><td><a href='#5.5.1'>5.5.1</td></tr>
<tr><td><samp>Illegal control character 0x1b, skipped the character</samp></td><td></td><td><a href='#5.4.1'>5.4.1</td></tr>
<tr><td><samp>Illegal digit in octal number "089"</samp></td><td></td><td></td><td><a href='#5.5.1'>5.5.1</td></tr>
<tr><td><samp>Illegal multi-byte character sequence "XY" in quotation</samp></td><td></td><td></td><td><a href='#5.5.1'>5.5.1</td></tr>
<tr><td><samp>Illegal multi-byte character sequence "XY"</samp></td><td></td><td><a href='#5.4.1'>5.4.1</td></tr>
<tr><td><samp>Illegal parameter "123"</samp></td><td></td><td><a href='#5.4.7'>5.4.7</td></tr>
<tr><td><samp>Illegal shift count "-1"</samp></td><td></td><td></td><td><a href='#5.5.4'>5.5.4</td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Illegal UCN sequence</samp></td><td></td><td><a href='#5.4.1'>5.4.1</td></tr>
<tr><td><samp>In #asm block started at line 123</samp></td><td></td><td><a href='#5.4.3'>5.4.3</td></tr>
<tr><td><samp>Integer character constant 'abcde' is out of range of unsigned long</samp></td><td></td><td></td><td><a href='#5.5.4'>5.5.4</td><td><a href='#5.6'>5.6</td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Integer character constant 'abcdefghi' is out of range</samp></td><td></td><td><a href='#5.4.6'>5.4.6</td><td></td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Less than necessary N argument(s) in macro call "macro( a)"</samp></td><td></td><td><a href='#5.4.9'>5.4.9</td><td><a href='#5.5.5'>5.5.5</td></tr>
<tr><td><samp>Line number "0x123" isn't a decimal digits sequence</samp></td><td></td><td><a href='#5.4.4'>5.4.4</td><td><a href='#5.5.6'>5.5.6</td></tr>
<tr><td><samp>Line number "2147483648" is out of range of 1,2147483647</samp></td><td></td><td><a href='#5.4.4'>5.4.4</td></tr>
<tr><td><samp>Line number "32768" got beyond range</samp></td><td></td><td></td><td><a href='#5.5.6'>5.5.6</td></tr>
<tr><td><samp>Line number "32768" is out of range of 1,32767</samp></td><td></td><td></td><td><a href='#5.5.6'>5.5.6</td></tr>
<tr><td><samp>Line number "32769" is out of range</samp></td><td></td><td></td><td><a href='#5.5.6'>5.5.6</td></tr>
<tr><td><samp>LL suffix is used in other than C99 mode "123LL"</samp></td><td></td><td></td><td><a href='#5.5.4'>5.5.4</td><td><a href='#5.6'>5.6</td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Logical source line longer than 509 bytes</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td></tr>
<tr><td><samp>Macro "MACRO" is expanded to "defined"</samp></td><td></td><td></td><td><a href='#5.5.4'>5.5.4</td></tr>
<tr><td><samp>Macro "MACRO" is expanded to "sizeof"</samp></td><td></td><td></td><td><a href='#5.5.4'>5.5.4</td></tr>
<tr><td><samp>Macro "MACRO" is expanded to 0 token</samp></td><td></td><td></td><td><a href='#5.5.4'>5.5.4</td></tr>
<tr><td><samp>Macro "macro" needs arguments</samp></td><td></td><td></td><td></td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Macro started at line 123 swallowed directive-like line</samp></td><td></td><td></td><td><a href='#5.5.5'>5.5.5</td></tr>
<tr><td><samp>Macro with mixing of ## and # operators isn't portable</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td></tr>
<tr><td><samp>Macro with multiple ## operators isn't portable</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td></tr>
<tr><td><samp>Misplaced ":", previous operator is "+"</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>Misplaced constant "12"</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>Missing ")"</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>Missing "," or ")" in parameter list "(a,b"</samp></td><td></td><td><a href='#5.4.7'>5.4.7</td></tr>
<tr><td><samp>More than 1024 macros defined</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td></tr>
<tr><td><samp>More than 31 parameters</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td></tr>
<tr><td><samp>More than 32 nesting of parens in #if expression</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td></tr>
<tr><td><samp>More than 8 nesting of #if (#ifdef) sections</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>More than 8 nesting of #include</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td></tr>
<tr><td><samp>More than <i>BLK_NEST</i> nesting of #if (#ifdef) sections</samp></td><td><a href='#5.3.3'>5.3.3</td></tr>
<tr><td><samp>More than <i>INCLUDE_NEST</i> nesting of #include</samp></td><td><a href='#5.3.3'>5.3.3</td></tr>
<tr><td><samp>More than necessary N argument(s) in macro call "macro( a, b, c)"</samp></td><td></td><td><a href='#5.4.9'>5.4.9</a></td></tr>
<tr><td><samp>More than <i>NEXP</i>*2-1 constants stacked at "12"</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>More than <i>NEXP</i>*3-1 operators and parens stacked at "+"</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>More than <i>NMACPARS</i> parameters</samp></td><td></td><td><a href='#5.4.7'>5.4.7</td></tr>
<tr><td><samp>Multi-character or multi-byte character constant 'XY' isn't portable</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Multi-character wide character constant L'ab' isn't portable</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Negative value "-1" is converted to positive "18446744073709551615"</samp></td><td></td><td></td><td><a href='#5.5.4'>5.5.4</td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>No argument</samp></td><td></td><td><a href='#5.4.4'>5.4.4</td><td><a href='#5.5.3'>5.5.3</td></tr>
<tr><td><samp>No header name</samp></td><td></td><td><a href='#5.4.4'>5.4.4</td></tr>
<tr><td><samp>No identifier</samp></td><td></td><td><a href='#5.4.4'>5.4.4</td></tr>
<tr><td><samp>No line number</samp></td><td></td><td><a href='#5.4.4'>5.4.4</td></tr>
<tr><td><samp>No space between macro name "MACRO" and repl-text</samp></td><td></td><td></td><td><a href='#5.5.3'>5.5.3</td></tr>
<tr><td><samp>No sub-directive</samp></td><td></td><td></td><td><a href='#5.5.3'>5.5.3</td></tr>
<tr><td><samp>No token after ##</samp></td><td></td><td><a href='#5.4.7'>5.4.7</td></tr>
<tr><td><samp>No token before ##</samp></td><td></td><td><a href='#5.4.7'>5.4.7</td></tr>
<tr><td><samp>Not a file name "name"</samp></td><td></td><td><a href='#5.4.4'>5.4.4</td></tr>
<tr><td><samp>Not a formal parameter "id"</samp></td><td></td><td><a href='#5.4.7'>5.4.7</td></tr>
<tr><td><samp>Not a header name "UNDEFINED_MACRO"</samp></td><td></td><td><a href='#5.4.4'>5.4.4</td></tr>
<tr><td><samp>Not a line number "name"</samp></td><td></td><td><a href='#5.4.4'>5.4.4</td></tr>
<tr><td><samp>Not a valid preprocessing token "+12"</samp></td><td></td><td><a href='#5.4.9'>5.4.9</td><td></td><td><a href='#5.6'>5.6</td></tr>
<tr><td><samp>Not a valid string literal</samp></td><td></td><td><a href='#5.4.9'>5.4.9</td></tr>
<tr><td><samp>Not an identifier "123"</samp></td><td></td><td><a href='#5.4.4'>5.4.4</td><td><a href='#5.5.3'>5.5.3</td></tr>
<tr><td><samp>Not an integer "1.23"</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>Not in a #if (#ifdef) section</samp></td><td></td><td><a href='#5.4.3'>5.4.3</td></tr>
<tr><td><samp>Not in a #if (#ifdef) section in a source file</samp></td><td></td><td><a href='#5.4.3'>5.4.3</td><td><a href='#5.5.3'>5.5.3</td></tr>
<tr><td><samp>Operand of _Pragma() is not a string literal</samp></td><td></td><td><a href='#5.4.12'>5.4.12</td></tr>
<tr><td><samp>Operator ">" in incorrect context</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>Old style predefined macro "linux" is used</samp></td><td></td><td></td><td><a href='#5.5.5'>5.5.5</td></tr>
<tr><td><samp>Out of memory (required size is 0x1234 bytes)</samp></td><td><a href='#5.3.2'>5.3.2</td></tr>
<tr><td><samp>Parsed "//" as comment</samp></td><td></td><td></td><td></td><td><a href='#5.6'>5.6</td></tr>
<tr><td><samp>Preprocessing assertion failed</samp></td><td></td><td><a href='#5.4.10'>5.4.10</td></tr>
<tr><td><samp>Quotation longer than 509 bytes "very_very_long_string"</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td></tr>
<tr><td><samp>Recursive macro definition of "macro" to "macro"</samp></td><td></td><td><a href='#5.4.9'>5.4.9</td></tr>
<tr><td><samp>Removed ',' preceding the absent variable argument</samp></td><td></td><td></td><td><a href='#5.5.5'>5.5.5</td></tr>
<tr><td><samp>Replacement text "sub(" of macro "head" involved subsequent text</samp></td><td></td><td></td><td><a href='#5.5.5'>5.5.5</td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Rescanning macro "macro" more than <i>RESCAN_LIMIT</i> times at "something"</samp></td><td></td><td><a href='#5.4.9'>5.4.9</td></tr>
<tr><td><samp>Result of "op" is out of range</samp></td><td></td><td><a href='#5.4.6'>5.4.6</td><td></td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Result of "op" is out of range of (unsigned) long</samp></td><td></td><td></td><td><a href='#5.5.4'>5.5.4</td><td><a href='#5.6'>5.6</td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Shift count "40" is larger than bit count of long</samp></td><td></td><td></td><td><a href='#5.5.4'>5.5.4</td><td><a href='#5.6'>5.6</td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>sizeof is disallowed in C Standard</samp></td><td></td><td></td><td></td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>sizeof: Illegal type combination with "type"</samp></td><td></td><td><a href='#5.4.6'>5.4.6</td><td></td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>sizeof: No type specified</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>sizeof: Syntax error</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>sizeof: Unknown type "type"</samp></td><td></td><td><a href='#5.4.6'>5.4.6</td><td></td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Skipped the #pragma line</samp></td><td></td><td></td><td></td><td><a href='#5.6'>5.6</td></tr>
<tr><td><samp>String literal longer than 509 bytes "very_very_long_string"</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td></tr>
<tr><td><samp>The macro is redefined</samp></td><td></td><td></td><td><a href='#5.5.4'>5.5.4</td></tr>
<tr><td><samp>This is not a preprocessed source</samp></td><td><a href='#5.3.4'>5.3.4</td></tr>
<tr><td><samp>This preprocessed file is corrupted</samp></td><td><a href='#5.3.4'>5.3.4</td></tr>
<tr><td><samp>Too long comment, discarded up to here</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td></tr>
<tr><td><samp>Too long header name "long-file-name"</samp></td><td><a href='#5.3.3'>5.3.3</td></tr>
<tr><td><samp>Too long identifier, truncated to "very_long_identifier"</samp></td><td></td><td></td><td><a href='#5.5.1'>5.5.1</td></tr>
<tr><td><samp>Too long line spliced by comments</samp></td><td><a href='#5.3.3'>5.3.3</td></tr>
<tr><td><samp>Too long logical line</samp></td><td><a href='#5.3.3'>5.3.3</td></tr>
<tr><td><samp>Too long number token "12345678901234"</samp></td><td><a href='#5.3.3'>5.3.3</td></tr>
<tr><td><samp>Too long pp-number token "1234toolong"</samp></td><td><a href='#5.3.3'>5.3.3</td></tr>
<tr><td><samp>Too long quotation "long-string"</samp></td><td><a href='#5.3.3'>5.3.3</td></tr>
<tr><td><samp>Too long source line</samp></td><td><a href='#5.3.3'>5.3.3</td></tr>
<tr><td><samp>Too long token</samp></td><td><a href='#5.3.3'>5.3.3</td></tr>
<tr><td><samp>Too many magics nested in macro argument</samp></td><td></td><td><a href='#5.4.9'>5.4.9</td></tr>
<tr><td><samp>Too many nested macros in tracing MACRO</samp></td><td></td><td><a href='#5.4.9'>5.4.9</td></tr>
<tr><td><samp>UCN cannot specify the value "0000007f"</samp></td><td></td><td><a href='#5.4.1'>5.4.1</td><td></td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Undefined escape sequence '\x'</samp></td><td></td><td></td><td><a href='#5.5.4'>5.5.4</td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Undefined symbol "name", evaluated to 0</samp></td><td></td><td></td><td></td><td></td><td><a href='#5.7'>5.7</td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Unknown #directive "pseudo-directive"</samp></td><td></td><td><a href='#5.4.4'>5.4.4</td><td><a href='#5.5.4'>5.5.4</td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Unknown argument "name"</samp></td><td></td><td></td><td><a href='#5.5.3'>5.5.3</td></tr>
<tr><td><samp>Unterminated character constant 't understand.</samp></td><td></td><td><a href='#5.4.1'>5.4.1</td></tr>
<tr><td><samp>Unterminated expression</samp></td><td></td><td><a href='#5.4.5'>5.4.5</td></tr>
<tr><td><samp>Unterminated header name &lt;header.h</samp></td><td></td><td><a href='#5.4.1'>5.4.1</a></td></tr>
<tr><td><samp>Unterminated macro call "macro( a, (b,c)"</samp></td><td></td><td><a href='#5.4.9'>5.4.9</td></tr>
<tr><td><samp>Unterminated string literal</samp></td><td></td><td><a href='#5.4.1'>5.4.1</td></tr>
<tr><td><samp>Unterminated string literal, catenated to the next line</samp></td><td></td><td></td><td><a href='#5.5.1'>5.5.1</td></tr>
<tr><td><samp>Variable argument macro is defined</samp></td><td></td><td></td><td></td><td><a href='#5.6'>5.6</td></tr>
<tr><td><samp>Wide character constant L'abc' is out of range of unsigned long</samp></td><td></td><td></td><td><a href='#5.5.4'>5.5.4</td><td><a href='#5.6'>5.6</td><td></td><td><a href='#5.8'>5.8</td></tr>
<tr><td><samp>Wide character constant L'abc' is out of range</samp></td><td></td><td><a href='#5.4.6'>5.4.6</td><td></td><td></td><td></td><td><a href='#5.8'>5.8</td></tr>
</table><br>
<br>

<h1><a name="6" href="#toc.6">6. Reporting on Bugs and Others</a></h1>

<p>I have developed the Validation Suite to verify the conformance of preprocessing to Standard C/C++, and have released it along with the <b>mcpp</b> source.  The Validation Suite is intended to let you verify all the Standard C preprocessing specifications.  Of course, I used the Validation Suite to check <b>mcpp</b> itself.  What is more, I have compiled <b>mcpp</b> on many compiler systems to verify its behavior.  I am therefore confident that <b>mcpp</b> is now almost free of bugs and of misinterpretations of the specifications.  However, I cannot deny the possibility that it still contains some bugs.</p>
<p>If you find any strange behavior, do not hesitate to let me know.  If you receive a diagnostic message saying "Bug: ...", it is undoubtedly a bug in <b>mcpp</b> or in the compiler system. (Probably, it is <b>mcpp</b>'s.)  However illegal a user program may be, if <b>mcpp</b> loses control, it is <b>mcpp</b> that is to blame.</p>

<p>When you report a bug, please be sure to provide the following information:</p>
<ol>
<li>The compiler system that <b>mcpp</b> is ported to.<br>
<li>A sample source (shorter is better) that allows reproduction of what looks like a bug.<br>
<li>Preprocessing results.<br>
</ol>

<p>Apart from bug reports, I would appreciate your feedback on <b>mcpp</b> usage, diagnostic messages, or this manual.</p>
<p>For your feedback or information, please post to "Open Discussion Forum" at:</p>
<p><a href="http://mcpp.sourceforge.net/"> http://mcpp.sourceforge.net/</a></p>
<p>or send via e-mail.</p>

</body>
</html>
